Abstract. We formulate query-subquery nets and use them to create the first framework
for developing algorithms for evaluating queries to Horn knowledge bases with the properties 
that: the approach is goal-directed; each subquery is processed only once and each supplement 
tuple, if desired, is transferred only once; operations are done set-at-a-time; and any control 
strategy can be used. Our intention is to increase efficiency of query processing by eliminating 
redundant computation, increasing flexibility and reducing the number of accesses to the 
secondary storage. The framework forms a generic evaluation method called QSQN. To deal 
with function symbols, we use a term-depth bound for atoms and substitutions occurring in 
the computation and propose to use iterative deepening search which iteratively increases the 
term-depth bound. We prove soundness and completeness of our generic evaluation method 
and show that, when the term-depth bound is fixed, the method has PTIME data complexity. 
We also show how tail recursion elimination can be incorporated into our framework and
propose two exemplary control strategies: one reduces the number of accesses to the
secondary storage, while the other is depth-first search.

Keywords: query processing, Datalog, Horn knowledge bases, QSQ, QSQR, QSQN, 
QSQTRE, magic-set transformation 

1 Introduction 

Horn knowledge bases are definite logic programs, which are usually so big that either they 
cannot be totally loaded into the computer memory or evaluations for them cannot be 
done totally in the computer memory. Thus, in contrast to logic programming, for Horn 
knowledge bases efficient access to the secondary storage is an important aspect. Horn 
knowledge bases can be treated as extensions of Datalog deductive databases without the 
range-restrictedness and function-free conditions.

This work studies query processing for Horn knowledge bases. It is a continuation of
Madalińska-Bugaj and Nguyen's work [7]. As argued in [7], the Horn fragment of first-order
logic plays an important role in knowledge representation and reasoning. The QSQN
(query-subquery net) evaluation method provided in the current paper is essentially different
from the QSQR (query-subquery recursive) method of [7]. However, some introductory
and preliminary texts are borrowed from [7].

An efficient method for evaluating queries to Horn knowledge bases should: 

— be goal-directed, i.e. the computation should be closely related to the given goal 

— be set-oriented (instead of tuple-oriented) in order to reduce the number of accesses to 
the secondary storage 

— do no redundant computation (or do as little of it as possible).

As discussed in [7], to develop evaluation procedures for Horn knowledge bases one can
either adapt tabled SLD-resolution systems of logic programming to reduce the number of
accesses to the secondary storage or generalize evaluation methods of Datalog to deal with
non-range-restricted definite logic programs and goals that may contain function symbols.

Tabled SLD-resolution systems like OLDT [17], SLD-AL [19, 20] and linear tabulated
resolution [14, 21] are efficient computational procedures for logic programming without
redundant recomputations, but they are not directly applicable to Horn knowledge bases to
obtain efficient evaluation engines because they are not set-oriented (set-at-a-time). In
particular, the suspension-resumption mechanism and the stack-wise representation as well as
the "global optimizations of SLD-AL" are all tuple-oriented (tuple-at-a-time). The data
structures for them are so complex that they must be dropped if one wants to convert the
methods to efficient set-oriented ones. The attempt to convert XSB [12, 13] (a state-of-the-art
implementation of OLDT) to Breadth-First XSB [5], a set-oriented engine for Horn
knowledge bases, removes essential features of XSB. Besides, as shown in Example 1.1, the
breadth-first approach is not always efficient.

Well-known evaluation methods for Datalog deductive databases include the top-down
methods QSQR [18], QoSaQ [20] and QSQ [18, 20, 1], and the bottom-up method based on
magic-set transformation and seminaive evaluation [3, 10, 1]. As the QSQ approach (including
QSQR and QoSaQ) is based on SLD-resolution and the magic-set technique simulates
QSQ, all of the mentioned methods are goal-directed.

The first version of the QSQR (query-subquery recursive) evaluation method was formulated
by Vieille in [18] for Datalog deductive databases. It is set-oriented and uses a tabulation
technique. That version is incomplete [20, 8]. As pointed out by Mohamed Yahya [7],
the version given in the book [1] by Abiteboul et al. is also incomplete. In [7], Madalińska-Bugaj
and Nguyen corrected and generalized the method for Horn knowledge bases. The
correction depends on clearing global input relations for each iteration of the main loop.
As observed by Vieille [20], the QSQR approach is like iterative deepening search. It allows
redundant recomputations (see [7, Remark 3.2]).

The QoSaQ evaluation method [20] is Vieille's adaptation of SLD-AL resolution for
Datalog deductive databases. This evaluation method can be implemented as a set-oriented
procedure, but Vieille stated that "We would like, however, to go even further and to claim
that the practical interest of our approach lies in its one-inference-at-a-time basis, as opposed
to having a set-theoretic basis. First, this tuple-based computational model permits a
fine analysis of the duplicate elimination issue. ..." [20, page 5]. Moreover, the specific
techniques of QoSaQ like "instantiation pattern", "rule compilation" and "projection" are heavily
based on the range-restrictedness and function-free conditions.

The magic-set technique [3, 10] for Datalog deductive databases simulates the top-down
QSQR evaluation by rewriting a given query to another equivalent one that, when evaluated
using a bottom-up technique (e.g. the seminaive evaluation), produces only facts produced
by the QSQR evaluation. Some authors have extended the magic-set technique for Horn
knowledge bases [9, 5]. The bottom-up techniques usually use breadth-first search and, as
shown in Example 1.1, are not always efficient. The magic-set transformation does not help
in the case of that example.



Example 1.1. The order of program clauses and the order of atoms in the bodies of program 
clauses may be essential, e.g., when the positive logic program defining intensional predicates 
is specified using the Prolog programming style. In such cases, the top-down depth-first 
approach may be much more efficient than the breadth-first approaches (including the one 
based on magic-set transformation and bottom-up seminaive evaluation). Here is such an
example, in which x, y, z denote variables and a_i, b_{i,j} denote constant symbols:

— the positive logic program (for defining intensional predicates p, q_1 and q_2):

p ← q_1(a_0, a_1000)
p ← q_2(a_0, a_1000)

q_1(x, y) ← r_1(x, y)
q_1(x, y) ← r_1(x, z), q_1(z, y)

q_2(x, y) ← r_2(x, y)
q_2(x, y) ← r_2(x, z), q_2(z, y)

— the extensional instance (for specifying extensional predicates r_1 and r_2):

I(r_1) = {(a_i, a_{i+1}) | 0 ≤ i < 1000}

I(r_2) = {(a_0, b_{1,j}) | 1 ≤ j ≤ 1000} ∪
         {(b_{i,j}, b_{i+1,j}) | 1 ≤ i < 999 and 1 ≤ j ≤ 1000} ∪
         {(b_{999,j}, a_1000) | 1 ≤ j ≤ 1000}

i.e.,

[Figure: graph of the extensional instance (not reproduced here).]

— the goal: ← p.

Our postulate is that the breadth-first approaches (including the evaluation method 
based on magic-set transformation and bottom-up seminaive evaluation) are too inflexible 
and not always efficient. Of course, depth-first search is not always good either. ◁
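
To get a feel for the sizes involved in Example 1.1, the following Python sketch builds the two extensional relations and counts their tuples. It is only an illustration under our reading of the index bounds of the example (0 ≤ i < n for r_1; 1 ≤ i < n−1 and 1 ≤ j ≤ n for the middle part of r_2). The function name build_instance and the string encoding of constants are hypothetical, and n (1000 in the example) is kept small here so that the block runs instantly. The sketch counts tuples only; it does not simulate any evaluation strategy.

```python
def build_instance(n):
    """Return (I(r1), I(r2)) for Example 1.1 with chain length n (n = 1000 in the text)."""
    def a(i):
        return f"a{i}"

    def b(i, j):
        return f"b{i},{j}"

    # r1: a single chain a0 -> a1 -> ... -> an
    r1 = {(a(i), a(i + 1)) for i in range(n)}
    # r2: n parallel chains a0 -> b(1,j) -> ... -> b(n-1,j) -> an
    r2 = {(a(0), b(1, j)) for j in range(1, n + 1)}
    r2 |= {(b(i, j), b(i + 1, j)) for i in range(1, n - 1) for j in range(1, n + 1)}
    r2 |= {(b(n - 1, j), a(n)) for j in range(1, n + 1)}
    return r1, r2


r1, r2 = build_instance(10)
print(len(r1), len(r2))  # 10 100
```

Under the same reading, n = 1000 gives 1000 tuples for r_1 and n² = 1,000,000 tuples for r_2, while the first clause for p can be proved by following the r_1 chain alone. This size gap is what a strategy that tries the first clause depth-first can exploit.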

The QSQ (query-subquery) approach for Datalog queries, as presented in [1], originates
from the QSQR method but allows a variety of control strategies. The QSQ framework [18, 1]
uses adornments to simulate SLD-resolution in pushing constant symbols from goals to
subgoals. The annotated version of QSQ also uses annotations to simulate SLD-resolution
in pushing repeats of variables from goals to subgoals (see [1]).

In this paper we generalize the QSQ approach for Horn knowledge bases. We formulate 
query-subquery nets and use them to create the first framework for developing algorithms 
for evaluating queries to Horn knowledge bases with the following properties: 

— the approach is goal-directed

— each subquery is processed only once

— each supplement tuple, if desired, is transferred only once 

— operations are done set-at-a-time 

— any control strategy can be used. 

Our intention is to increase efficiency of query processing by eliminating redundant computation,
increasing flexibility and reducing the number of accesses to the secondary storage.
The framework forms a generic evaluation method called QSQN. Similarly to [7] but in
contrast to the QSQ framework for Datalog queries [1], it does not use adornments and
annotations (but has the effects of the annotated version). To deal with function symbols,
we use a term-depth bound for atoms and substitutions occurring in the computation and
propose to use iterative deepening search which iteratively increases the term-depth bound.
We prove soundness and completeness of our generic evaluation method and show that,
when the term-depth bound is fixed, the method has PTIME data complexity. We also
show how tail recursion elimination [11] can be incorporated into our framework and
propose two exemplary control strategies: one reduces the number of accesses to the
secondary storage, while the other is depth-first search.

The rest of this paper is structured as follows. In Section 2 we recall some notions of
first-order logic, logic programming, and Horn knowledge bases. In Section 3 we present our
QSQN evaluation method for Horn knowledge bases. We prove its soundness and completeness
in Section 4 and estimate its data complexity in Section 5. We consider tail recursion
elimination in Section 6 and propose exemplary control strategies for our method in Section 7.
Concluding remarks are given in Section 8.

2 Preliminaries 

First-order logic is considered in this work and we assume that the reader is familiar with 
it. We recall only the most important definitions for our work and refer the reader to [6, 2]
for further reading. 

A signature for first-order logic consists of constant symbols, function symbols, and 
predicate symbols. Terms and formulas over a fixed signature are defined using the symbols 
of the signature and variables in the usual way. An atom is a formula of the form p(t_1, ..., t_n),
where p is an n-ary predicate and t_1, ..., t_n are terms. An expression is either a term, a tuple
of terms, a formula without quantifiers or a list of formulas without quantifiers. A simple 
expression is either a term or an atom. The term-depth of an expression is the maximal 
nesting depth of function symbols occurring in that expression. 
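
As an illustration of the term-depth notion, here is a minimal Python sketch. The encoding is an assumption made for this example only: variables and constant symbols are strings, and a compound term f(t_1, ..., t_n) is the tuple ('f', t_1, ..., t_n). The names term_depth and tuple_depth are hypothetical.

```python
def term_depth(t):
    """Maximal nesting depth of function symbols in a term."""
    if isinstance(t, str):  # a variable or a constant symbol
        return 0
    _functor, *args = t     # compound term: ('f', arg1, ..., argn)
    return 1 + max((term_depth(arg) for arg in args), default=0)


def tuple_depth(terms):
    """Term-depth of a tuple of terms, e.g. the argument tuple of an atom."""
    return max((term_depth(t) for t in terms), default=0)


print(term_depth("?x"))                          # 0
print(term_depth(("f", "a")))                    # 1
print(tuple_depth(("?x", ("f", ("g", "?y")))))   # 2
```

Under this encoding the predicate symbol of an atom does not count towards the depth, so the depth of p(x, f(g(y))) is taken over its argument tuple.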

2.1 Substitution and Unification 

A substitution is a finite set θ = {x_1/t_1, ..., x_k/t_k}, where x_1, ..., x_k are pairwise distinct
variables, t_1, ..., t_k are terms, and t_i ≠ x_i for all 1 ≤ i ≤ k. The set dom(θ) = {x_1, ..., x_k}
is called the domain of θ, while the set range(θ) = {t_1, ..., t_k} is called the range of θ.
By ε we denote the empty substitution. The restriction of a substitution θ to a set X of
variables is the substitution θ|_X = {(x/t) ∈ θ | x ∈ X}. The term-depth of a substitution is
the maximal nesting depth of function symbols occurring in that substitution.

Let θ = {x_1/t_1, ..., x_k/t_k} be a substitution and E be an expression. Then Eθ, the
instance of E by θ, is the expression obtained from E by simultaneously replacing all
occurrences of the variable x_i in E by the term t_i, for 1 ≤ i ≤ k.

Let θ = {x_1/t_1, ..., x_k/t_k} and δ = {y_1/s_1, ..., y_h/s_h} be substitutions (where
x_1, ..., x_k are pairwise distinct variables, and y_1, ..., y_h are also pairwise distinct
variables). Then the composition θδ of θ and δ is the substitution obtained from the
sequence {x_1/(t_1δ), ..., x_k/(t_kδ), y_1/s_1, ..., y_h/s_h} by deleting any binding x_i/(t_iδ) for which
x_i = (t_iδ) and deleting any binding y_j/s_j for which y_j ∈ {x_1, ..., x_k}.

A substitution θ is idempotent if θθ = θ. It is known that θ = {x_1/t_1, ..., x_k/t_k} is
idempotent if none of x_1, ..., x_k occurs in any of t_1, ..., t_k.
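
The definitions of instance, composition and idempotence can be sketched in Python as follows. This is an illustrative sketch, not part of the method: substitutions are modelled as dictionaries, variables are assumed to be strings starting with '?', constants are other strings, and compound terms are tuples ('f', t_1, ..., t_n). The function names apply, compose and is_idempotent are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def apply(t, theta):
    """Instance of term t by theta: all bindings are applied simultaneously."""
    if is_var(t):
        return theta.get(t, t)
    if isinstance(t, str):  # constant symbol
        return t
    return (t[0],) + tuple(apply(arg, theta) for arg in t[1:])


def compose(theta, delta):
    """Composition theta·delta, following the definition in the text."""
    result = {x: apply(t, delta) for x, t in theta.items()}
    # delete bindings x_i/(t_i delta) with x_i = t_i delta
    result = {x: t for x, t in result.items() if x != t}
    # keep bindings y_j/s_j of delta only if y_j is not in dom(theta)
    for y, s in delta.items():
        if y not in theta:
            result[y] = s
    return result


def is_idempotent(theta):
    return compose(theta, theta) == theta


theta = {"?x": ("f", "?y")}
delta = {"?y": "a"}
print(compose(theta, delta))            # {'?x': ('f', 'a'), '?y': 'a'}
print(is_idempotent(theta))             # True
print(is_idempotent({"?x": ("f", "?x")}))  # False
```

Representing a substitution as a dictionary keyed by variable enforces that the bound variables are pairwise distinct.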

If θ and δ are substitutions such that θδ = δθ = ε, then we call them renaming substitutions.
We say that an expression E is a variant of an expression E' if there exist substitutions
θ and γ such that E = E'θ and E' = Eγ.

A substitution θ is more general than a substitution δ if there exists a substitution γ
such that δ = θγ. Note that according to this definition, θ is more general than itself.

Let Γ be a set of simple expressions. A substitution θ is called a unifier for Γ if Γθ is
a singleton. If Γθ = {φ} then we say that θ unifies Γ (into φ). A unifier θ for Γ is called a
most general unifier (mgu) for Γ if θ is more general than every unifier of Γ.

There is an effective algorithm, called the unification algorithm, for checking whether
a set Γ of simple expressions is unifiable (i.e. has a unifier) and computing an idempotent
mgu for Γ if Γ is unifiable (see, e.g., [6]).
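
For two simple expressions, such an algorithm can be sketched as below. This is a textbook-style unification with occurs check, written under assumptions made for this example only: variables are strings starting with '?', constants are other strings, and compound terms and atoms are tuples ('f', t_1, ..., t_n). It is not claimed to be the algorithm of [6]; the names unify, apply and occurs are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def apply(t, theta):
    if is_var(t):
        return theta.get(t, t)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(arg, theta) for arg in t[1:])


def occurs(x, t):
    """True if variable x occurs in term t."""
    if t == x:
        return True
    return not isinstance(t, str) and any(occurs(x, arg) for arg in t[1:])


def unify(s, t):
    """Return an idempotent mgu of s and t as a dict, or None if none exists."""
    theta = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = apply(a, theta), apply(b, theta)
        if a == b:
            continue
        if is_var(b) and not is_var(a):
            a, b = b, a
        if is_var(a):
            if occurs(a, b):
                return None  # occurs check fails
            # substitute a/b into existing bindings to keep theta idempotent
            theta = {x: apply(u, {a: b}) for x, u in theta.items()}
            theta[a] = b
        elif isinstance(a, str) or isinstance(b, str) or a[0] != b[0] or len(a) != len(b):
            return None      # symbol clash
        else:
            stack.extend(zip(a[1:], b[1:]))
    return theta


s = ("p", "?x", ("f", "?y"))
t = ("p", "a", ("f", "?x"))
mgu = unify(s, t)
print(mgu == {"?x": "a", "?y": "a"})      # True
print(apply(s, mgu) == apply(t, mgu))     # True
```

Applying the current bindings before each comparison keeps every right-hand side free of bound variables, which is why the returned substitution is idempotent.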

If E is an expression or a substitution then by Vars(E) we denote the set of variables
occurring in E. If φ is a formula then by ∀(φ) we denote the universal closure of φ, which
is the formula obtained by adding a universal quantifier for every variable having a free
occurrence in φ.

2.2 Positive Logic Programs and SLD-Resolution 

A (positive or definite) program clause is a formula of the form ∀(A ∨ ¬B_1 ∨ ... ∨ ¬B_k)
with k ≥ 0, written as A ← B_1, ..., B_k, where A, B_1, ..., B_k are atoms. A is called the
head, and (B_1, ..., B_k) the body of the program clause. If p is the predicate of A then the
program clause is called a program clause defining p.

A positive (or definite) logic program is a finite set of program clauses. 

A goal (also called a negative clause) is a formula of the form ∀(¬B_1 ∨ ... ∨ ¬B_k), written
as ← B_1, ..., B_k, where B_1, ..., B_k are atoms. If k = 1 then the goal is called a unary goal.
If k = 0 then the goal stands for falsity and is called the empty goal (or the empty clause)
and denoted by □.

If P is a positive logic program and G = ← B_1, ..., B_k is a goal, then θ is called a
correct answer for P ∪ {G} if P ⊨ ∀((B_1 ∧ ... ∧ B_k)θ).

We now give definitions for SLD-resolution.

A goal G' is derived from a goal G = ← A_1, ..., A_i, ..., A_k and a program clause
φ = (A ← B_1, ..., B_h) using A_i as the selected atom and θ as the most general unifier
(mgu) if θ is an mgu for A_i and A, and G' = ← (A_1, ..., A_{i−1}, B_1, ..., B_h, A_{i+1}, ..., A_k)θ.
We call G' a resolvent of G and φ. If i = 1 then we say that G' is derived from G and φ
using the leftmost selection function.

Let P be a positive logic program and G be a goal. 

An SLD-derivation from P ∪ {G} consists of a (finite or infinite) sequence G_0 = G, G_1,
G_2, ... of goals, a sequence φ_1, φ_2, ... of variants of program clauses of P and a sequence
θ_1, θ_2, ... of mgu's such that each G_{i+1} is derived from G_i and φ_{i+1} using θ_{i+1}. Each φ_i
is a suitable variant of the corresponding program clause. That is, φ_i does not have any
variables which already appear in the derivation up to G_{i−1}. Each program clause variant
φ_i is called an input program clause.

An SLD-refutation of P ∪ {G} is a finite SLD-derivation from P ∪ {G} which has the
empty clause as the last goal in the derivation.

A computed answer θ for P ∪ {G} is the substitution obtained by restricting the composition
θ_1 ... θ_n to the variables of G, where θ_1, ..., θ_n is the sequence of mgu's occurring
in an SLD-refutation of P ∪ {G}.

Theorem 2.1 (Soundness and Completeness of SLD-Resolution [4, 16]). Let P be
a positive logic program and G be a goal. Then every computed answer for P ∪ {G} is a
correct answer for P ∪ {G}. Conversely, for every correct answer θ for P ∪ {G}, using any
selection function there exists a computed answer δ for P ∪ {G} such that Gθ = Gδγ for
some substitution γ. ◁

We will also use the following well-known lemma:

Lemma 2.2 (Lifting Lemma). Let P be a positive logic program, G be a goal, θ be a
substitution, and l be a natural number. Suppose there exists an SLD-refutation of P ∪ {Gθ}
using mgu's θ_1, ..., θ_n such that the variables of the input program clauses are distinct from
the variables in G and θ and the term-depths of the goals are bounded by l. Then there
exist a substitution γ and an SLD-refutation of P ∪ {G} using the same sequence of input
program clauses, the same selected atoms and mgu's θ'_1, ..., θ'_n such that the term-depths of
the goals are bounded by l and θθ_1 ... θ_n = θ'_1 ... θ'_n γ. ◁

The Lifting Lemma given in [6] does not contain the condition "the variables of the input
program clauses are distinct from the variables in G and 9" and is therefore inaccurate (see, 
e.g., [2]). The correct version given above follows from the one presented, amongst others, 
in [15]. For applications of this lemma in this paper, we assume that fresh variables from a 
special infinite list of variables are used for renaming variables of input program clauses in 
SLD-derivations, and that mgu's are computed using a standard method. The mentioned 
condition will thus be satisfied. 

In a computational process, a fresh variant of a formula φ, where φ can be an atom, a
goal ← A or a program clause A ← B_1, ..., B_k (written without quantifiers), is a formula
φθ, where θ is a renaming substitution such that dom(θ) = Vars(φ) and range(θ) consists
of fresh variables that were not used in the computation (and the input).

2.3 Definitions for Horn Knowledge Bases 

As in the case of deductive databases, we classify each predicate either as intensional or as
extensional. A generalized tuple is a tuple of terms, which may contain function symbols 
and variables. A generalized relation is a set of generalized tuples of the same arity. A Horn 
knowledge base is defined to be a pair consisting of a positive logic program for defining 
intensional predicates and a generalized extensional instance, which is a function mapping 
each extensional n-ary predicate to an n-ary generalized relation. Note that intensional 
predicates are defined by a positive logic program which may contain function symbols and 
not be range-restricted. From now on, we use the term "relation" to mean a generalized 
relation, and the term "extensional instance" to mean a generalized extensional instance. 

Given a Horn knowledge base specified by a positive logic program P and an extensional
instance I, a query to the knowledge base is a positive formula φ(x) without quantifiers,
where x is a tuple of all the variables of φ.³ A (correct) answer for the query is a tuple t of
terms of the same length as x such that P ∪ I ⊨ ∀(φ(t)). When measuring data complexity,
we assume that P and φ are fixed, while I varies. Thus, the pair (P, φ(x)) is treated as a
query to the extensional instance I. We will use the term "query" in this meaning.



³ A positive formula without quantifiers is a formula built up from atoms using only connectives ∧ and ∨.

It can easily be shown that every query (P, φ(x)) can be transformed in polynomial time
to an equivalent query of the form (P', q(x)) over a signature extended with new intensional
predicates, including q. The equivalence means that, for every extensional instance I and
every tuple t of terms of the same length as x, P ∪ I ⊨ ∀(φ(t)) iff P' ∪ I ⊨ ∀(q(t)). The
transformation is based on introducing new predicates for defining complex subformulas
occurring in the query. For example, if φ = p(x) ∧ r(x, y), then P' = P ∪ {q(x, y) ← p(x), r(x, y)},
where q is a new intensional predicate.

Without loss of generality, we will consider only queries of the form (P,q(x)), where q 
is an intensional predicate. Answering such a query on an extensional instance I is to find 
(correct) answers for P ∪ I ∪ {← q(x)}.

3 Query-Subquery Nets 

Let P be a positive logic program and φ_1, ..., φ_m be all the program clauses of P, with

φ_i = (A_i ← B_{i,1}, ..., B_{i,n_i})

where n_i ≥ 0. A query-subquery net structure (in short, QSQ-net structure) of P is a tuple
(V, E, T) such that:

— V consists of nodes

• input_p and ans_p for each intensional predicate p of P

• pre_filter_i, filter_{i,1}, ..., filter_{i,n_i}, post_filter_i for each 1 ≤ i ≤ m

— E consists of edges

• (filter_{i,1}, filter_{i,2}), ..., (filter_{i,n_i−1}, filter_{i,n_i}) for each 1 ≤ i ≤ m

• (pre_filter_i, filter_{i,1}) and (filter_{i,n_i}, post_filter_i) for each 1 ≤ i ≤ m with n_i ≥ 1

• (pre_filter_i, post_filter_i) for each 1 ≤ i ≤ m with n_i = 0

• (input_p, pre_filter_i) and (post_filter_i, ans_p) for each 1 ≤ i ≤ m, where p is the
predicate of A_i

• (filter_{i,j}, input_p) and (ans_p, filter_{i,j}) for each intensional predicate p and each
1 ≤ i ≤ m and 1 ≤ j ≤ n_i such that B_{i,j} is an atom of p

— T is a function, called the memorizing type of the net structure, mapping each node
filter_{i,j} ∈ V such that the predicate of B_{i,j} is extensional to true or false.

If (v, w) ∈ E then we call w a successor of v, and v a predecessor of w. Note that V and E
are uniquely specified by P. We call the pair (V, E) the topological structure of P.

Example 3.1. Figure 1 illustrates the QSQ topological structure of the following positive
logic program:

p(x, y) ← q(x, y)
p(x, y) ← q(x, z), p(z, y).
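
The construction of V and E from a program can be sketched in Python as follows. This is a minimal illustration under stated assumptions: a clause is abstracted to its head predicate and the list of its body predicates, a predicate is taken to be intensional exactly when it occurs in some clause head, and nodes are represented by strings. The function name qsq_topology and the node-name encoding are hypothetical.

```python
def qsq_topology(program):
    """program: list of (head_predicate, [body_predicate, ...]), one pair per clause."""
    intensional = {head for head, _ in program}
    V, E = set(), set()
    for p in intensional:
        V |= {f"input_{p}", f"ans_{p}"}
    for i, (head, body) in enumerate(program, start=1):
        pre, post = f"pre_filter_{i}", f"post_filter_{i}"
        filters = [f"filter_{i},{j}" for j in range(1, len(body) + 1)]
        V |= {pre, post, *filters}
        # pre_filter -> filter_1 -> ... -> filter_n -> post_filter
        # (reduces to pre_filter -> post_filter when the body is empty)
        chain = [pre, *filters, post]
        E |= set(zip(chain, chain[1:]))
        E |= {(f"input_{head}", pre), (post, f"ans_{head}")}
        for node, q in zip(filters, body):
            if q in intensional:
                E |= {(node, f"input_{q}"), (f"ans_{q}", node)}
    return V, E


# Example 3.1: p(x,y) <- q(x,y) and p(x,y) <- q(x,z), p(z,y)
V, E = qsq_topology([("p", ["q"]), ("p", ["q", "p"])])
print(len(V), len(E))  # 9 11
```

For the program of Example 3.1 the only intensional predicate under this assumption is p, so the net has the nodes input_p and ans_p plus the filter chains of the two clauses.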

A query-subquery net (in short, QSQ-net) of P is a tuple N = (V, E, T, C) such that
(V, E, T) is a QSQ-net structure of P and C is a mapping that associates each node v ∈ V
with a structure called the contents of v, satisfying the following conditions:

— C(v), where v = input_p or v = ans_p for an intensional predicate p of P, consists of:

• tuples(v): a set of generalized tuples of the same arity as p

• unprocessed(v, w) for (v, w) ∈ E: a subset of tuples(v)

— C(v), where v = pre_filter_i, consists of:

• atom(v) = A_i and post_vars(v) = Vars((B_{i,1}, ..., B_{i,n_i}))

Fig. 1. The QSQ topological structure of the program given in Example 3.1.



— C(v), where v = post_filter_i, is empty, but we assume pre_vars(v) = ∅

— C(v), where v = filter_{i,j} and p is the predicate of B_{i,j}, consists of:

• kind(v) = extensional if p is extensional, and kind(v) = intensional otherwise

• pred(v) = p and atom(v) = B_{i,j}

• pre_vars(v) = Vars((B_{i,j}, ..., B_{i,n_i})) and post_vars(v) = Vars((B_{i,j+1}, ..., B_{i,n_i}))

• subqueries(v): a set of pairs of the form (t, δ), where t is a generalized tuple of the
same arity as the predicate of A_i and δ is an idempotent substitution such that
dom(δ) ⊆ pre_vars(v) and dom(δ) ∩ Vars(t) = ∅

• unprocessed_subqueries(v) ⊆ subqueries(v)

• in the case p is intensional:

∗ unprocessed_subqueries2(v) ⊆ subqueries(v)

∗ unprocessed_tuples(v): a set of generalized tuples of the same arity as p

— if v = filter_{i,j}, kind(v) = extensional and T(v) = false then subqueries(v) = ∅.

By a subquery we mean a pair of the form (t, δ), where t is a generalized tuple and δ is
an idempotent substitution such that dom(δ) ∩ Vars(t) = ∅.

For v = filter_{i,j} and p being the predicate of A_i, the meaning of a subquery (t, δ) ∈
subqueries(v) is that: for processing the goal ← p(s) with s ∈ tuples(input_p) using the
program clause φ_i = (A_i ← B_{i,1}, ..., B_{i,n_i}), unification of p(s) and A_i as well as processing
of the subgoals B_{i,1}, ..., B_{i,j−1} were done, amongst others, by using a sequence of mgu's
γ_0, ..., γ_{j−1} with the property that t = sγ_0 ... γ_{j−1} and δ = (γ_0 ... γ_{j−1})|_{Vars((B_{i,j}, ..., B_{i,n_i}))}.

An empty QSQ-net of P is a QSQ-net of P with all the sets of the forms tuples(v),
unprocessed(v, w), subqueries(v), unprocessed_subqueries(v), unprocessed_subqueries2(v),
unprocessed_tuples(v) being empty.

In a QSQ-net, if v = pre_filter_i or v = post_filter_i or (v = filter_{i,j} and kind(v) =
extensional) then v has exactly one successor, which we denote by succ(v).

If v is filter_{i,j} with kind(v) = intensional and pred(v) = p then v has exactly two
successors. In that case, let

succ(v) = filter_{i,j+1} if n_i > j, and succ(v) = post_filter_i otherwise,

and succ2(v) = input_p. The set unprocessed_subqueries(v) is used for (i.e. corresponds to)
the edge (v, succ(v)), while unprocessed_subqueries2(v) is used for the edge (v, succ2(v)).

Note that if succ(v) = w then post_vars(v) = pre_vars(w). In particular,
post_vars(filter_{i,n_i}) = pre_vars(post_filter_i) = ∅.

The formats of data transferred through edges of a QSQ-net are specified as follows:

— data transferred through an edge of the form (input_p, v), (v, input_p), (v, ans_p) or
(ans_p, v) is a finite set of generalized tuples of the same arity as p

— data transferred through an edge (u, v) with v = filter_{i,j} and u not being of the form
ans_p is a finite set of subqueries that can be added to subqueries(v)

— data transferred through an edge (v, post_filter_i) is a set of subqueries (t, ε) such that t
is a generalized tuple of the same arity as the predicate of A_i.

If (t, δ) and (t', δ') are subqueries that can be transferred through an edge to v then we
say that (t, δ) is more general than (t', δ') w.r.t. v, and that (t', δ') is less general than (t, δ)
w.r.t. v, if there exists a substitution γ such that tγ = t' and (δγ)|_{pre_vars(v)} = δ'.

Informally, a subquery (t, δ) transferred through an edge to v is processed as follows:

— if v = filter_{i,j}, kind(v) = extensional and pred(v) = p then, for each t' ∈ I(p), if
atom(v)δ = B_{i,j}δ is unifiable with a fresh variant of p(t') by an mgu γ then transfer the
subquery (tγ, (δγ)|_{post_vars(v)}) through (v, succ(v))

— if v = filter_{i,j}, kind(v) = intensional and pred(v) = p then

• transfer the input tuple t' such that p(t') = atom(v)δ = B_{i,j}δ through (v, input_p)
to add a fresh variant of it to tuples(input_p)

• for each currently existing t' ∈ tuples(ans_p), if atom(v)δ = B_{i,j}δ is unifiable with
a fresh variant of p(t') by an mgu γ then transfer the subquery (tγ, (δγ)|_{post_vars(v)})
through (v, succ(v))

• store the subquery (t, δ) in subqueries(v), and later, for each new t' added to
tuples(ans_p), if atom(v)δ = B_{i,j}δ is unifiable with a fresh variant of p(t') by an
mgu γ then transfer the subquery (tγ, (δγ)|_{post_vars(v)}) through (v, succ(v))

— if v = post_filter_i and p is the predicate of A_i then transfer the answer tuple t through
(post_filter_i, ans_p) to add it to tuples(ans_p).
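
The extensional case of this informal description can be sketched in Python. The sketch makes simplifying assumptions that are not part of the method: the tuples of I(p) are ground (so no fresh variants are needed), the term-depth bound and the "most general subquery" filtering are omitted, variables are strings starting with '?', and atoms are tuples ('p', t_1, ..., t_n). All function names are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def apply(t, theta):
    if is_var(t):
        return theta.get(t, t)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(a, theta) for a in t[1:])

def compose(theta, delta):
    out = {x: apply(t, delta) for x, t in theta.items()}
    out = {x: t for x, t in out.items() if x != t}
    out.update({y: s for y, s in delta.items() if y not in theta})
    return out

def occurs(x, t):
    return t == x or (not isinstance(t, str) and any(occurs(x, a) for a in t[1:]))

def unify(s, t):
    """Idempotent mgu of s and t as a dict, or None."""
    theta, stack = {}, [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = apply(a, theta), apply(b, theta)
        if a == b:
            continue
        if is_var(b) and not is_var(a):
            a, b = b, a
        if is_var(a):
            if occurs(a, b):
                return None
            theta = {x: apply(u, {a: b}) for x, u in theta.items()}
            theta[a] = b
        elif isinstance(a, str) or isinstance(b, str) or a[0] != b[0] or len(a) != len(b):
            return None
        else:
            stack.extend(zip(a[1:], b[1:]))
    return theta

def process_extensional(subquery, atom, relation, post_vars):
    """Subqueries to send through (v, succ(v)) for an extensional filter node v."""
    t, delta = subquery
    goal = apply(atom, delta)                      # atom(v)δ
    produced = []
    for row in relation:                           # each t' in I(p), assumed ground
        gamma = unify(goal, (atom[0], *row))
        if gamma is None:
            continue
        new_t = tuple(apply(a, gamma) for a in t)  # tγ
        new_delta = {x: s for x, s in compose(delta, gamma).items() if x in post_vars}
        produced.append((new_t, new_delta))        # (tγ, (δγ)|post_vars(v))
    return produced


# Node filter_{2,1} of p(x,y) <- q(x,z), p(z,y): atom q(?x,?z), post_vars = {?z, ?y}
out = process_extensional((("a", "?y"), {"?x": "a"}), ("q", "?x", "?z"),
                          [("a", "b"), ("b", "c")], {"?z", "?y"})
print(out)  # [(('a', '?y'), {'?z': 'b'})]
```

In the usage example only the tuple (a, b) of the relation matches q(a, ?z), so a single subquery is produced, carrying the binding for ?z that the next filter node needs.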

Formally, the processing of a subquery is designed in a more sophisticated way, so that:

— every subquery / input tuple / answer tuple subsumed by another one is ignored

— every subquery / input tuple / answer tuple with term-depth greater than l is ignored

— the processing is divided into smaller steps which can be delayed to maximize flexibility
and allow various control strategies

— the processing is done set-at-a-time (e.g., for all the unprocessed subqueries accumulated
in a given node).
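
The first point, ignoring tuples that are subsumed by others, can be sketched in Python as follows. The sketch assumes variables are strings starting with '?' and compound terms are tuples ('f', t_1, ..., t_n); it uses one-way matching, which treats the variables of the candidate instance as constants, so no renaming to fresh variables is performed here. The term-depth check is left out, and the names match, is_instance and add_tuple are hypothetical.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def match(pattern, term, theta):
    """Extend theta so that pattern instantiated by theta equals term, or return None."""
    if is_var(pattern):
        if pattern in theta:
            return theta if theta[pattern] == term else None
        return {**theta, pattern: term}
    if isinstance(pattern, str) or isinstance(term, str):
        return theta if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        theta = match(p, t, theta)
        if theta is None:
            return None
    return theta


def is_instance(t, general):
    """True if the tuple t is an instance of the tuple general."""
    return len(t) == len(general) and match(("tuple", *general), ("tuple", *t), {}) is not None


def add_tuple(t, gamma):
    """Add t to the list gamma, keeping only the most general tuples."""
    if any(is_instance(t, g) for g in gamma):
        return                                   # t is subsumed: ignore it
    gamma[:] = [g for g in gamma if not is_instance(g, t)]
    gamma.append(t)


gamma = []
add_tuple(("a", "?y"), gamma)
add_tuple(("a", "b"), gamma)      # ignored, an instance of (a, ?y)
add_tuple(("?x", "?z"), gamma)    # replaces (a, ?y)
print(gamma)                      # [('?x', '?z')]
```

The same pattern extends to subqueries by matching the tuple part and the substitution part together.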

Procedure transfer(D, u, v) (given on page 10) specifies the effects of transferring
data D through an edge (u, v) of a QSQ-net. If v is of the form pre_filter_i or post_filter_i
or (v = filter_{i,j} and kind(v) = extensional and T(v) = false) then the input D for
v is processed immediately and appropriate data Γ is produced and transferred through
(v, succ(v)). Otherwise, the input D for v is not processed immediately, but accumulated
into the structure of v in an appropriate way.

Function active-edge(u, v) (given on page 12) returns true for an edge (u, v) if data
accumulated in u can be processed to produce some data to transfer through (u, v), and
returns false otherwise.

In the case active-edge(u, v) is true, procedure fire(u, v) (given on page 12) processes
data accumulated in u that has not been processed before to transfer appropriate data
through the edge (u, v).

Algorithm 1 (given on page 13) presents our QSQN evaluation method for Horn knowledge
bases.

3.1 Relaxing Term-Depth Bound 

Suppose that we want to compute as many as possible but no more than k correct answers 
for a query (P,q(x)) on an extensional instance I within time limit L. Then we can use 
Procedure transfer(D, u, v)

Global data: a Horn knowledge base (P, I), a QSQ-net N = (V, E, T, C) of P, and a term-depth
bound l.

Input: data D to transfer through the edge (u, v) ∈ E.

if D = ∅ then return;
if u is input_p then
    Γ := ∅;
    foreach t ∈ D do
        if p(t) and atom(v) are unifiable by an mgu γ then
            add-subquery(tγ, γ|_{post_vars(v)}, Γ, succ(v))
    transfer(Γ, v, succ(v))
else if u is ans_p then unprocessed_tuples(v) := unprocessed_tuples(v) ∪ D
else if v is input_p or ans_p then
    foreach t ∈ D do
        let t' be a fresh variant of t;
        if t' is not an instance of any tuple from tuples(v) then
            foreach t'' ∈ tuples(v) do
                if t'' is an instance of t' then
                    delete t'' from tuples(v);
                    foreach (v, w) ∈ E do delete t'' from unprocessed(v, w)
            if v is input_p then
                add t' to tuples(v);
                foreach (v, w) ∈ E do add t' to unprocessed(v, w)
            else
                add t to tuples(v);
                foreach (v, w) ∈ E do add t to unprocessed(v, w)
else if v is filter_{i,j} and kind(v) = extensional and T(v) = false then
    let p = pred(v) and set Γ := ∅;
    foreach (t, δ) ∈ D do
        if term-depth(atom(v)δ) ≤ l then
            foreach t' ∈ I(p) do
                if atom(v)δ is unifiable with a fresh variant of p(t') by an mgu γ then
                    add-subquery(tγ, (δγ)|_{post_vars(v)}, Γ, succ(v))
    transfer(Γ, v, succ(v))
else if v is filter_{i,j} and (kind(v) = extensional and T(v) = true or kind(v) = intensional) then
    foreach (t, δ) ∈ D do
        if term-depth(atom(v)δ) ≤ l then
            if no subquery in subqueries(v) is more general than (t, δ) then
                delete from subqueries(v) all subqueries less general than (t, δ);
                delete from unprocessed_subqueries(v) all subqueries less general than (t, δ);
                add (t, δ) to both subqueries(v) and unprocessed_subqueries(v);
                if kind(v) = intensional then
                    delete from unprocessed_subqueries2(v) all subqueries less general than (t, δ);
                    add (t, δ) to unprocessed_subqueries2(v)
else // v is of the form post_filter_i
    Γ := {t | (t, ε) ∈ D};
    transfer(Γ, v, succ(v))

Procedure add-subquery(t, δ, Γ, v)

Purpose: add the subquery (t, δ) to Γ, but keep in Γ only the most general subqueries w.r.t. v.

if term-depth(t) ≤ l and term-depth(δ) ≤ l and no subquery in Γ is more general than (t, δ)
w.r.t. v then
    delete from Γ all subqueries less general than (t, δ) w.r.t. v;
    add (t, δ) to Γ



Procedure add-tuple(t, Γ)

Purpose: add the tuple t to Γ, but keep in Γ only the most general tuples.

let t' be a fresh variant of t;
if t' is not an instance of any tuple from Γ then
    delete from Γ all tuples that are instances of t';
    add t' to Γ



iterative deepening search which iteratively increases term-depth bound for atoms and 
substitutions occurring in the computation as follows: 

1. Initialize the term-depth bound l to 0 (or another small natural number).

2. Run Algorithm 1 for evaluating (P, q(x)) on I within the time limit.

3. While tuples(ans_q) contains fewer than k tuples and the time limit has not been reached, do:

(a) Clear (empty) all the sets of the form tuples(input_p) and subqueries(filter_{i,j}).

(b) Increase the term-depth bound l by 1.

(c) Run Algorithm 1 without Steps 1 and 2.

4. Return tuples(ans_q).
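
The outer loop of this iterative deepening scheme can be sketched in Python as follows. The callback `run` is a hypothetical stand-in for "run Algorithm 1 with term-depth bound l and return tuples(ans_q)". Clearing the input and subquery sets and skipping Steps 1 and 2 are assumed to happen inside it, and the sketch checks the time limit only between runs rather than inside a run.

```python
import time
from typing import Callable, Set


def iterative_deepening(run: Callable[[int], Set[tuple]], k: int,
                        time_limit: float, start_bound: int = 0) -> Set[tuple]:
    """Raise the term-depth bound until k answers are found or time runs out."""
    deadline = time.monotonic() + time_limit
    bound = start_bound
    answers = run(bound)                       # Steps 1 and 2
    while len(answers) < k and time.monotonic() < deadline:
        bound += 1                             # Step 3(b)
        answers = run(bound)                   # Steps 3(a) and 3(c), delegated to `run`
    return answers                             # Step 4


# Toy evaluator (an assumption for the demo): with bound l it "finds" the
# numerals of depth at most l, encoded here simply as integers 0..l.
def toy_run(bound: int) -> Set[tuple]:
    return {(n,) for n in range(bound + 1)}


result = iterative_deepening(toy_run, k=3, time_limit=5.0)
```

With the toy evaluator, the loop stops at the first bound for which at least three answers are available.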

4 Soundness and Completeness 

The following lemma states a property of Algorithm 1. Its proof is straightforward.

Lemma 4.1. Consider a run of Algorithm 1 (using parameter $l$) on a query $(P, q(\bar{x}))$ and an extensional instance $I$, and let $(V, E, T, C)$ be the resulting QSQ-net. Let $v = \mathit{filter}_{i,j}$ for some $1 \le i \le m$ and $1 \le j \le n_i$. Let $w = \mathit{succ}(v)$ and let $u = \mathit{filter}_{i,j-1}$ if $j > 1$, and $u = \mathit{pre\_filter}_i$ otherwise. Suppose that a subquery $(\bar{s}', \delta')$ was transferred through $(v, w)$ at some step $k$. Then a subquery $(\bar{s}, \delta)$ was transferred through $(u, v)$ at some earlier step $h < k$ with the property that:

— if $\mathit{kind}(v) = \mathit{extensional}$ and $\mathit{pred}(v) = p$ then there exists $\bar{t}' \in I(p)$ such that $\mathit{atom}(v)\delta$ is unifiable with a fresh variant of $p(\bar{t}')$ by an mgu $\gamma$, $\bar{s}' = \bar{s}\gamma$ and $\delta' = (\delta\gamma)|_{\mathit{post\_vars}(v)}$

— if $\mathit{kind}(v) = \mathit{intensional}$ and $\mathit{pred}(v) = p$ then there was $\bar{t}' \in \mathit{tuples}(\mathit{ans\_p})$ at step $k$ such that $\mathit{atom}(v)\delta$ is unifiable with a fresh variant of $p(\bar{t}')$ by an mgu $\gamma$, $\bar{s}' = \bar{s}\gamma$ and $\delta' = (\delta\gamma)|_{\mathit{post\_vars}(v)}$. ◁

Theorem 4.2 (Soundness). After a run of Algorithm 1 on a query $(P, q(\bar{x}))$ and an extensional instance $I$, for all intensional predicates $p$ of $P$, every computed answer $\bar{t} \in \mathit{tuples}(\mathit{ans\_p})$ is a correct answer in the sense that $P \cup I \models \forall(p(\bar{t}))$.

Proof. We prove $P \cup I \models \forall(p(\bar{t}))$ by induction on the number of the step at which $\bar{t}$ was added to $\mathit{tuples}(\mathit{ans\_p})$. Suppose $\bar{t}$ was added to $\mathit{tuples}(\mathit{ans\_p})$ as the result of transferring $\bar{t}$ through the edge $(\mathit{post\_filter}_i, \mathit{ans\_p})$, which was triggered by the transfer of $(\bar{t}, \varepsilon)$ through
Function active-edge(u, v)

Global data: a QSQ-net N = (V, E, T, C).
Input: an edge (u, v) ∈ E.
Output: true if there are data to transfer through the edge (u, v), and false otherwise.

1 if u is pre_filter_i or post_filter_i then return false
2 else if u is input_p or ans_p then return unprocessed(u, v) ≠ ∅
3 else if u is filter_{i,j} and kind(u) = extensional then
4   return T(u) = true ∧ unprocessed_subqueries(u) ≠ ∅
5 else // u is of the form filter_{i,j} and kind(u) = intensional
6   let p = pred(u);
7   if v = input_p then return unprocessed_subqueries2(u) ≠ ∅
8   else return unprocessed_subqueries(u) ≠ ∅ ∨ unprocessed_tuples(u) ≠ ∅



Procedure fire(u, v)

Global data: a Horn knowledge base (P, I), a QSQ-net N = (V, E, T, C) of P, and a term-depth bound l.
Input: an edge (u, v) ∈ E such that active-edge(u, v) holds.

1 if u is input_p or ans_p then
2   transfer(unprocessed(u, v), u, v);
3   unprocessed(u, v) := ∅
4 else if u is filter_{i,j} and kind(u) = extensional and T(u) = true then
5   let p = pred(u) and set Γ := ∅;
6   foreach (t, δ) ∈ unprocessed_subqueries(u) do
7     foreach t' ∈ I(p) do
8       if atom(u)δ is unifiable with a fresh variant of p(t') by an mgu γ then
9         add-subquery(tγ, (δγ)|post_vars(u), Γ, v)
10  unprocessed_subqueries(u) := ∅;
11  transfer(Γ, u, v)
12 else if u is filter_{i,j} and kind(u) = intensional then
13  let p = pred(u) and set Γ := ∅;
14  if v = input_p then
15    foreach (t, δ) ∈ unprocessed_subqueries2(u) do let p(t') = atom(u)δ, add-tuple(t', Γ);
16    unprocessed_subqueries2(u) := ∅;
17  else
18    foreach (t, δ) ∈ unprocessed_subqueries(u) do
19      foreach t' ∈ tuples(ans_p) do
20        if atom(u)δ is unifiable with a fresh variant of p(t') by an mgu γ then
21          add-subquery(tγ, (δγ)|post_vars(u), Γ, v)
22    unprocessed_subqueries(u) := ∅;
23    if unprocessed_tuples(u) ≠ ∅ then
24      foreach t ∈ unprocessed_tuples(u) do
25        foreach (t', δ) ∈ subqueries(u) do
26          if atom(u)δ is unifiable with a fresh variant of p(t) by an mgu γ then
27            add-subquery(t'γ, (δγ)|post_vars(u), Γ, v)
28      unprocessed_tuples(u) := ∅
29  transfer(Γ, u, v)
Algorithm 1: for evaluating a query (P, q(x)) on an extensional instance I.

1 let (V, E, T) be a QSQ-net structure of P; // T can be chosen arbitrarily
2 set C so that N = (V, E, T, C) is an empty QSQ-net of P;
3 let x' be a fresh variant of x;
4 tuples(input_q) := {x'};
5 foreach (input_q, v) ∈ E do unprocessed(input_q, v) := {x'};
6 while there exists (u, v) ∈ E such that active-edge(u, v) holds do
7   select (u, v) ∈ E such that active-edge(u, v) holds;
    // any strategy is acceptable for the above selection
8   fire(u, v)
9 return tuples(ans_q)
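
Steps 6 to 8 of Algorithm 1 form a worklist loop whose only free choice is the selection of an active edge. The following Python sketch shows that loop with the control strategy passed in as a function. The names `evaluate`, `active_edge`, `fire` and `select` are placeholders chosen for this illustration, and the toy net at the end is not a QSQ-net, just two edges that pass a counter along.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Edge = Tuple[Hashable, Hashable]


def evaluate(edges: Sequence[Edge],
             active_edge: Callable[[Hashable, Hashable], bool],
             fire: Callable[[Hashable, Hashable], None],
             select: Callable[[List[Edge]], Edge] = lambda active: active[0]) -> int:
    """Fire active edges until none is left; return the number of firings."""
    fired = 0
    while True:
        active = [e for e in edges if active_edge(*e)]
        if not active:
            return fired
        u, v = select(active)      # any control strategy is acceptable here
        fire(u, v)
        fired += 1


def toy_run(select: Callable[[List[Edge]], Edge]) -> int:
    """Pass two pending items along a -> b -> c and count what reaches c."""
    pending = {("a", "b"): 2, ("b", "c"): 0}
    delivered: List[int] = []

    def active_edge(u: Hashable, v: Hashable) -> bool:
        return pending[(u, v)] > 0

    def fire(u: Hashable, v: Hashable) -> None:
        n, pending[(u, v)] = pending[(u, v)], 0
        if (u, v) == ("a", "b"):
            pending[("b", "c")] += n
        else:
            delivered.append(n)

    evaluate(list(pending), active_edge, fire, select)
    return sum(delivered)


first_choice = toy_run(lambda active: active[0])
last_choice = toy_run(lambda active: active[-1])
```

Both strategies deliver the same data in this toy; only the order of firings may differ.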



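the edge $(\mathit{filter}_{i,n_i}, \mathit{post\_filter}_i)$. Let $\bar{s}_{n_i} = \bar{t}$ and $\delta_{n_i} = \varepsilon$. Let $v_0 = \mathit{pre\_filter}_i$ and $v_j = \mathit{filter}_{i,j}$ for $1 \le j \le n_i$. By Lemma 4.1, for each $j$ from $n_i$ to 1, there exists a subquery $(\bar{s}_{j-1}, \delta_{j-1})$ transferred through $(v_{j-1}, v_j)$ such that:

(1) if $\mathit{kind}(v_j) = \mathit{extensional}$ and $\mathit{pred}(v_j) = p_j$ then there exists $\bar{t}'_j \in I(p_j)$ such that $\mathit{atom}(v_j)\delta_{j-1}$ is unifiable with a fresh variant of $p_j(\bar{t}'_j)$ by an mgu $\gamma_j$, $\bar{s}_j = \bar{s}_{j-1}\gamma_j$ and $\delta_j = (\delta_{j-1}\gamma_j)|_{\mathit{post\_vars}(v_j)}$;

(2) if $\mathit{kind}(v_j) = \mathit{intensional}$ and $\mathit{pred}(v_j) = p_j$ then there exists $\bar{t}'_j \in \mathit{tuples}(\mathit{ans\_p}_j)$ such that $\mathit{atom}(v_j)\delta_{j-1}$ is unifiable with a fresh variant of $p_j(\bar{t}'_j)$ by an mgu $\gamma_j$, $\bar{s}_j = \bar{s}_{j-1}\gamma_j$ and $\delta_j = (\delta_{j-1}\gamma_j)|_{\mathit{post\_vars}(v_j)}$.

We have that $A_i\delta_0 = p(\bar{s}_0)$. We prove by an inner induction on $1 \le j \le n_i + 1$ that:

$$\text{for every substitution } \theta, \text{ if } P \cup I \models \forall((B_{i,j} \wedge \ldots \wedge B_{i,n_i})\delta_{j-1}\theta) \text{ then } P \cup I \models \forall(p(\bar{s}_{j-1})\theta). \tag{3}$$

Base case ($j = 1$): Since $P \cup I \models \forall(\varphi_i)$, we have $P \cup I \models \forall((B_{i,1} \wedge \ldots \wedge B_{i,n_i} \to A_i)\delta_0\theta)$. Hence, if $P \cup I \models \forall((B_{i,1} \wedge \ldots \wedge B_{i,n_i})\delta_0\theta)$ then $P \cup I \models \forall(A_i\delta_0\theta)$, which means $P \cup I \models \forall(p(\bar{s}_0)\theta)$.

Induction step: Suppose the induction hypothesis holds for $j \le n_i$, i.e.,

$$\text{for every } \theta, \text{ if } P \cup I \models \forall((B_{i,j} \wedge \ldots \wedge B_{i,n_i})\delta_{j-1}\theta) \text{ then } P \cup I \models \forall(p(\bar{s}_{j-1})\theta). \tag{4}$$

We show that it also holds for $j + 1$, i.e.,

$$\text{for every } \theta', \text{ if } P \cup I \models \forall((B_{i,j+1} \wedge \ldots \wedge B_{i,n_i})\delta_j\theta') \text{ then } P \cup I \models \forall(p(\bar{s}_j)\theta'). \tag{5}$$

Suppose

$$P \cup I \models \forall((B_{i,j+1} \wedge \ldots \wedge B_{i,n_i})\delta_j\theta'). \tag{6}$$

Take $\theta = \gamma_j\theta'$.

— Consider the case $\mathit{kind}(v_j) = \mathit{extensional}$ and let $p_j = \mathit{pred}(v_j)$. By (1), there exist $\bar{t}'_j \in I(p_j)$ and a fresh variant $\bar{t}''_j$ of $\bar{t}'_j$ such that $\gamma_j = \mathit{mgu}(B_{i,j}\delta_{j-1}, p_j(\bar{t}''_j))$, $\bar{s}_j = \bar{s}_{j-1}\gamma_j$ and $\delta_j = (\delta_{j-1}\gamma_j)|_{\mathit{post\_vars}(v_j)}$. We have $P \cup I \models \forall(p_j(\bar{t}'_j))$, hence $P \cup I \models \forall(p_j(\bar{t}''_j)\gamma_j)$, which means $P \cup I \models \forall(B_{i,j}\delta_{j-1}\gamma_j)$. Hence $P \cup I \models \forall(B_{i,j}\delta_{j-1}\gamma_j\theta')$, which means

$$P \cup I \models \forall(B_{i,j}\delta_{j-1}\theta). \tag{7}$$

Since $\delta_j = (\delta_{j-1}\gamma_j)|_{\mathit{post\_vars}(v_j)}$ and $\theta = \gamma_j\theta'$, we have that

$$(B_{i,j+1} \wedge \ldots \wedge B_{i,n_i})\delta_j\theta' = (B_{i,j+1} \wedge \ldots \wedge B_{i,n_i})\delta_{j-1}\theta.$$

This together with (4), (6) and (7) implies $P \cup I \models \forall(p(\bar{s}_{j-1})\theta)$. Since $\bar{s}_{j-1}\theta = \bar{s}_{j-1}\gamma_j\theta' = \bar{s}_j\theta'$, it follows that $P \cup I \models \forall(p(\bar{s}_j)\theta')$, which completes the proof of (5) for the case $\mathit{kind}(v_j) = \mathit{extensional}$.

— Consider the case $\mathit{kind}(v_j) = \mathit{intensional}$ and let $p_j = \mathit{pred}(v_j)$. By (2), there exist $\bar{t}'_j \in \mathit{tuples}(\mathit{ans\_p}_j)$ and a fresh variant $\bar{t}''_j$ of $\bar{t}'_j$ such that $\gamma_j = \mathit{mgu}(B_{i,j}\delta_{j-1}, p_j(\bar{t}''_j))$, $\bar{s}_j = \bar{s}_{j-1}\gamma_j$ and $\delta_j = (\delta_{j-1}\gamma_j)|_{\mathit{post\_vars}(v_j)}$. By the inductive assumption of the outer induction, we have $P \cup I \models \forall(p_j(\bar{t}'_j))$, hence $P \cup I \models \forall(p_j(\bar{t}''_j)\gamma_j)$, which means $P \cup I \models \forall(B_{i,j}\delta_{j-1}\gamma_j)$. Analogously to the above case, we can derive that $P \cup I \models \forall(p(\bar{s}_j)\theta')$, which completes the proof of (5) and (3).

By (3), when $j = n_i + 1$ and $\theta = \varepsilon$, we have that $P \cup I \models \forall(p(\bar{s}_{n_i}))$, which means $P \cup I \models \forall(p(\bar{t}))$. ◁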

We need the following lemma for the completeness theorem. We assume that the sets of fresh variables used for renaming variables of input program clauses in SLD-refutations and in Algorithm 1 are disjoint.

Lemma 4.3. After a run of Algorithm 1 (using parameter $l$) on a query $(P, q(\bar{x}))$ and an extensional instance $I$, for every intensional predicate $r$ of $P$, for every $\bar{s} \in \mathit{tuples}(\mathit{input\_r})$ and for every SLD-refutation of $P \cup I \cup \{\leftarrow r(\bar{s})\}$ that uses the leftmost selection function, does not contain any goal with term-depth greater than $l$ and has a computed answer $\theta$ with the term-depth of $\bar{s}\theta$ not greater than $l$, there exists $\bar{s}' \in \mathit{tuples}(\mathit{ans\_r})$ such that $\bar{s}\theta$ is an instance of a variant of $\bar{s}'$.

Proof. We prove this lemma by induction on the length of the mentioned SLD-refutation. Let $\theta_1, \ldots, \theta_h$ be the sequence of mgu's used in the refutation. We have that $r(\bar{s})\theta_1 \ldots \theta_h = r(\bar{s})\theta$. Suppose that the first step of the refutation of $P \cup I \cup \{\leftarrow r(\bar{s})\}$ uses an input program clause $\varphi'_i = (A'_i \leftarrow B'_{i,1}, \ldots, B'_{i,n_i})$, which is a variant of a program clause $\varphi_i = (A_i \leftarrow B_{i,1}, \ldots, B_{i,n_i})$ of $P$, resulting in the resolvent $\leftarrow (B'_{i,1}, \ldots, B'_{i,n_i})\theta_1$. Let $k_1 = 2$, $k_{n_i+1} = h + 1$ and suppose that, for $1 \le j \le n_i$,

$$\text{the fragment for processing } \leftarrow B'_{i,j}\theta_1 \ldots \theta_{k_j-1} \text{ of the refutation of } P \cup I \cup \{\leftarrow r(\bar{s})\} \text{ uses mgu's } \theta_{k_j}, \ldots, \theta_{k_{j+1}-1}. \tag{8}$$

Thus, after processing the atom $B'_{i,j-1}$ for $2 \le j \le n_i + 1$, the next goal of the refutation of $\leftarrow r(\bar{s})$ is $\leftarrow (B'_{i,j}, \ldots, B'_{i,n_i})\theta_1 \ldots \theta_{k_j-1}$. (If $j = n_i + 1$ then the goal is empty.)

Let $\varrho$ be a renaming substitution such that $\varphi'_i = \varphi_i\varrho$. Thus, $B'_{i,j} = B_{i,j}\varrho$ for $1 \le j \le n_i$. We can assume that $\varrho$ does not use any variable occurring in $\bar{s}$.

We will refer to the data structures used by Algorithm 1.

We first prove the following remark:

Remark 4.4. Let $1 \le j \le n_i$, $v = \mathit{filter}_{i,j}$, $u = \mathit{filter}_{i,j-1}$ if $j > 1$, and $u = \mathit{pre\_filter}_i$ otherwise. If $(\bar{t}_{j-1}, \delta_{j-1})$ is a subquery transferred through $(u, v)$ at some step and there exists a substitution $\eta$ such that

$$(A_i, (B_{i,j}, \ldots, B_{i,n_i}))\varrho\theta_1 \ldots \theta_{k_j-1} = (r(\bar{t}_{j-1}), (B_{i,j}, \ldots, B_{i,n_i})\delta_{j-1})\eta \tag{9}$$

then there exist a subquery $(\bar{t}_j, \delta_j)$ transferred through $(v, \mathit{succ}(v))$ at some step and a substitution $\eta'$ such that

$$(A_i, (B_{i,j+1}, \ldots, B_{i,n_i}))\varrho\theta_1 \ldots \theta_{k_{j+1}-1} = (r(\bar{t}_j), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_j)\eta'. \tag{10}$$
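Suppose the premises of this remark hold. Without loss of generality we assume that:

$$\text{if } (\mathit{kind}(v) = \mathit{extensional} \text{ and } T(v) = \mathit{true}) \text{ or } \mathit{kind}(v) = \mathit{intensional} \text{ then the subquery } (\bar{t}_{j-1}, \delta_{j-1}) \text{ was added to } \mathit{subqueries}(v). \tag{11}$$

Since $B'_{i,j} = B_{i,j}\varrho$ and (9), we have that:

$$(\leftarrow B'_{i,j}\theta_1 \ldots \theta_{k_j-1}) = (\leftarrow B_{i,j}\varrho\theta_1 \ldots \theta_{k_j-1}) = (\leftarrow B_{i,j}\delta_{j-1}\eta). \tag{12}$$

Since the term-depth of $B_{i,j}\delta_{j-1}\eta = B'_{i,j}\theta_1 \ldots \theta_{k_j-1}$ is not greater than $l$, the term-depth of $B_{i,j}\delta_{j-1}$ is also not greater than $l$. By (8), (12) and Lifting Lemma 2.2 we have that

$$\begin{aligned}&\text{there exists a refutation of } P \cup I \cup \{\leftarrow B_{i,j}\delta_{j-1}\} \text{ using the leftmost selection function and mgu's } \theta'_{k_j}, \ldots, \theta'_{k_{j+1}-1}\\&\text{such that the term-depths of goals are not greater than } l \text{ and } \eta\theta_{k_j} \ldots \theta_{k_{j+1}-1} = \theta'_{k_j} \ldots \theta'_{k_{j+1}-1}\mu \text{ for some substitution } \mu.\end{aligned} \tag{13}$$

Consider the case when the predicate $p = \mathit{pred}(v)$ of $B_{i,j}$ is an extensional predicate. Thus,

$$k_{j+1} = k_j + 1 \tag{14}$$

and

$$B_{i,j}\delta_{j-1}\theta'_{k_j} = p(\bar{t}')\sigma\theta'_{k_j} \tag{15}$$

where $p(\bar{t}')\sigma$ is the input program clause used for resolving $\leftarrow B_{i,j}\delta_{j-1}$, with $\bar{t}' \in I(p)$ and $\sigma$ being a renaming substitution. Regarding the transfer of the subquery $(\bar{t}_{j-1}, \delta_{j-1})$ through $(u, v)$, under the assumption (11), Algorithm 1 unifies $\mathit{atom}(v)\delta_{j-1} = B_{i,j}\delta_{j-1}$ with a fresh variant $p(\bar{t}')\sigma'$ of $p(\bar{t}')$, where $\sigma'$ is a renaming substitution, resulting in an mgu $\gamma$ (by (15), $B_{i,j}\delta_{j-1}$ and $p(\bar{t}')\sigma'$ are unifiable, which is also justified below) and then transfers the subquery $(\bar{t}_{j-1}\gamma, (\delta_{j-1}\gamma)|_{\mathit{post\_vars}(v)})$ through $(v, \mathit{succ}(v))$. Let

$$\bar{t}_j = \bar{t}_{j-1}\gamma \quad\text{and}\quad \delta_j = (\delta_{j-1}\gamma)|_{\mathit{post\_vars}(v)}. \tag{16}$$

We have that $\sigma = \sigma'\sigma''$ for some renaming substitution $\sigma''$ such that

$$\sigma'' \text{ does not use variables of } \bar{t}_{j-1}, \delta_{j-1} \text{ and } \mathit{pre\_vars}(v). \tag{17}$$

Thus $B_{i,j}\delta_{j-1}\sigma''\theta'_{k_j} = B_{i,j}\delta_{j-1}\theta'_{k_j}$, and by (15) and the fact $\sigma = \sigma'\sigma''$, we have that

$$(B_{i,j}\delta_{j-1})\sigma''\theta'_{k_j} = B_{i,j}\delta_{j-1}\theta'_{k_j} = p(\bar{t}')\sigma\theta'_{k_j} = (p(\bar{t}')\sigma')\sigma''\theta'_{k_j}.$$

Hence, $B_{i,j}\delta_{j-1}$ and $p(\bar{t}')\sigma'$ are unifiable using $\sigma''\theta'_{k_j}$, while $\gamma$ is an mgu for them. Hence

$$\sigma''\theta'_{k_j} = \gamma\mu' \tag{18}$$

for some substitution $\mu'$. Let $\eta' = \mu'\mu$. We have that:

$$\begin{aligned}
&(A_i, (B_{i,j+1}, \ldots, B_{i,n_i}))\varrho\theta_1 \ldots \theta_{k_{j+1}-1}\\
&= ((A_i, (B_{i,j+1}, \ldots, B_{i,n_i}))\varrho\theta_1 \ldots \theta_{k_j-1})\theta_{k_j} \ldots \theta_{k_{j+1}-1}\\
&= (r(\bar{t}_{j-1}), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_{j-1})\eta\theta_{k_j} \ldots \theta_{k_{j+1}-1} && \text{(by the assumption (9))}\\
&= (r(\bar{t}_{j-1}), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_{j-1})\theta'_{k_j} \ldots \theta'_{k_{j+1}-1}\mu && \text{(by (13))}\\
&= (r(\bar{t}_{j-1}), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_{j-1})\sigma''\theta'_{k_j} \ldots \theta'_{k_{j+1}-1}\mu && \text{(by (17))}\\
&= (r(\bar{t}_{j-1}), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_{j-1})\gamma\mu'\mu && \text{(by (14) and (18))}\\
&= (r(\bar{t}_j), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_j)\eta' && \text{(by (16) and the fact } \eta' = \mu'\mu).
\end{aligned}$$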
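We have shown (10) and thus proved Remark 4.4 for the case when the predicate of $B_{i,j}$ is extensional.

Now consider the case when the predicate $p$ of $B_{i,j}$ is an intensional predicate.

By the assumption (11), the subquery $(\bar{t}_{j-1}, \delta_{j-1})$ was also added to $\mathit{unprocessed\_subqueries}_2(v)$. Let $B_{i,j}\delta_{j-1} = p(\bar{t}')$. There must exist some tuple $\bar{t}$ more general than $\bar{t}'$ that was added to $\mathit{tuples}(\mathit{input\_p})$ at some step. Let $\bar{t}\alpha = \bar{t}'$ for some substitution $\alpha$ that uses only variables from $\bar{t}$ and $\bar{t}'$. Thus,

$$B_{i,j}\delta_{j-1} = p(\bar{t})\alpha. \tag{19}$$

By (13) and Lifting Lemma 2.2, it follows that there exists a refutation of $P \cup I \cup \{\leftarrow p(\bar{t})\}$ using the leftmost selection function and mgu's $\theta''_{k_j}, \ldots, \theta''_{k_{j+1}-1}$ such that the term-depths of the goals are not greater than $l$ and

$$\alpha\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} = \theta''_{k_j} \ldots \theta''_{k_{j+1}-1}\beta \tag{20}$$

for some substitution $\beta$. By the inductive assumption, $\mathit{tuples}(\mathit{ans\_p})$ contains a tuple $\bar{t}''$ such that $\bar{t}\theta''_{k_j} \ldots \theta''_{k_{j+1}-1}$ is an instance of a variant of $\bar{t}''$. Since

$$\begin{aligned}
B_{i,j}\delta_{j-1}\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} &= p(\bar{t})\alpha\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} && \text{(by (19))}\\
&= p(\bar{t})\theta''_{k_j} \ldots \theta''_{k_{j+1}-1}\beta && \text{(by (20))}
\end{aligned}$$

it follows that

$$B_{i,j}\delta_{j-1}\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} \text{ is an instance of a variant of } p(\bar{t}''). \tag{21}$$

From a certain moment on, there were both $(\bar{t}_{j-1}, \delta_{j-1}) \in \mathit{subqueries}(v)$ and $\bar{t}'' \in \mathit{tuples}(\mathit{ans\_p})$. Hence, at some step Algorithm 1 unified $\mathit{atom}(v)\delta_{j-1} = B_{i,j}\delta_{j-1}$ with a fresh variant $p(\bar{t}'')\sigma$ of $p(\bar{t}'')$, where $\sigma$ is a renaming substitution. The atom $p(\bar{t}'')\sigma$ does not contain variables of $\bar{t}_{j-1}$, $\delta_{j-1}$, $\mathit{pre\_vars}(v)$ and $\theta'_{k_j} \ldots \theta'_{k_{j+1}-1}$. By (21), $B_{i,j}\delta_{j-1}$ and $p(\bar{t}'')\sigma$ are unifiable. Let the resulting mgu be $\gamma$ and let

$$\bar{t}_j = \bar{t}_{j-1}\gamma \quad\text{and}\quad \delta_j = (\delta_{j-1}\gamma)|_{\mathit{post\_vars}(v)}. \tag{22}$$

Algorithm 1 then transferred the subquery $(\bar{t}_j, \delta_j)$ through $(v, \mathit{succ}(v))$.

By (21), $B_{i,j}\delta_{j-1}\theta'_{k_j} \ldots \theta'_{k_{j+1}-1}$ is an instance of $p(\bar{t}'')\sigma$. Let $\rho$ be a substitution with domain contained in $\mathit{Vars}(p(\bar{t}'')\sigma)$ such that $B_{i,j}\delta_{j-1}\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} = p(\bar{t}'')\sigma\rho$. We have that

$$\text{the domain of } \rho \text{ does not contain variables of } \bar{t}_{j-1}, \delta_{j-1}, \mathit{pre\_vars}(v) \text{ and } \theta'_{k_j} \ldots \theta'_{k_{j+1}-1} \tag{23}$$

and $\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} \cup \rho$ is a unifier for $B_{i,j}\delta_{j-1}$ and $p(\bar{t}'')\sigma$. As $\gamma$ is an mgu for $B_{i,j}\delta_{j-1}$ and $p(\bar{t}'')\sigma$, we have that

$$\gamma\mu' = (\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} \cup \rho) \tag{24}$$

for some substitution $\mu'$. Let $\eta' = \mu'\mu$. We have that:

$$\begin{aligned}
&(A_i, (B_{i,j+1}, \ldots, B_{i,n_i}))\varrho\theta_1 \ldots \theta_{k_{j+1}-1}\\
&= (r(\bar{t}_{j-1}), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_{j-1})\theta'_{k_j} \ldots \theta'_{k_{j+1}-1}\mu && \text{(as shown before)}\\
&= (r(\bar{t}_{j-1}), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_{j-1})(\theta'_{k_j} \ldots \theta'_{k_{j+1}-1} \cup \rho)\mu && \text{(by (23))}\\
&= (r(\bar{t}_{j-1}), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_{j-1})\gamma\mu'\mu && \text{(by (24))}\\
&= (r(\bar{t}_j), (B_{i,j+1}, \ldots, B_{i,n_i})\delta_j)\eta' && \text{(by (22) and the fact } \eta' = \mu'\mu).
\end{aligned}$$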
We have shown (10) and thus proved Remark 4.4 for the case when the predicate of $B_{i,j}$ is intensional. This completes the proof of this remark.

Recall that $r(\bar{s})\varrho = r(\bar{s})$. Since $\theta_1 = \mathit{mgu}(r(\bar{s}), A'_i)$ and $A'_i = A_i\varrho$, it follows that $r(\bar{s})\varrho\theta_1 = r(\bar{s})\theta_1 = A'_i\theta_1 = A_i\varrho\theta_1$ and hence $\varrho\theta_1$ is a unifier for $r(\bar{s})$ and $A_i$. Let $\gamma_0$ be the mgu Algorithm 1 uses to unify $r(\bar{s})$ with $A_i$. Thus, $\gamma_0\eta_0 = \varrho\theta_1$ for some substitution $\eta_0$. Moreover, $(\bar{t}_0, \delta_0) = (\bar{s}\gamma_0, (\gamma_0)|_{\mathit{pre\_vars}(\mathit{filter}_{i,1})})$ is a subquery Algorithm 1 transferred through $(\mathit{pre\_filter}_i, \mathit{filter}_{i,1})$. Recall that $k_1 = 2$ and observe that the premises of Remark 4.4 hold for $j = 1$ and for the subquery $(\bar{t}_0, \delta_0)$ using $\eta = \eta_0$. Hence there exist a subquery $(\bar{t}_1, \delta_1)$ transferred through $(\mathit{filter}_{i,1}, \mathit{filter}_{i,2})$ at some step and a substitution $\eta_1$ such that

$$(A_i, (B_{i,2}, \ldots, B_{i,n_i}))\varrho\theta_1 \ldots \theta_{k_2-1} = (r(\bar{t}_1), (B_{i,2}, \ldots, B_{i,n_i})\delta_1)\eta_1.$$

For each $1 \le j \le n_i$, we can apply Remark 4.4 to obtain a subquery $(\bar{t}_j, \delta_j)$ and $\eta_j$ (for $\eta'$). Since $\mathit{post\_vars}(\mathit{filter}_{i,n_i}) = \emptyset$, it follows that, for $j = n_i$, we have that $(\bar{t}_{n_i}, \varepsilon)$ is a subquery transferred through $(\mathit{filter}_{i,n_i}, \mathit{post\_filter}_i)$ at some step and

$$A_i\varrho\theta_1 \ldots \theta_{k_{n_i+1}-1} = r(\bar{t}_{n_i})\eta_{n_i}.$$

Since $k_{n_i+1} = h + 1$ and $\theta = (\theta_1 \ldots \theta_h)|_{\mathit{Vars}(\bar{s})}$, it follows that

$$r(\bar{s})\theta = r(\bar{s})\theta_1 \ldots \theta_h = A'_i\theta_1 \ldots \theta_h = A_i\varrho\theta_1 \ldots \theta_h = r(\bar{t}_{n_i})\eta_{n_i}.$$

Thus, $\bar{s}\theta$ is an instance of $\bar{t}_{n_i}$. Since $(\bar{t}_{n_i}, \varepsilon)$ was transferred through $(\mathit{filter}_{i,n_i}, \mathit{post\_filter}_i)$, $\mathit{tuples}(\mathit{ans\_r})$ will contain $\bar{s}'$ such that $\bar{t}_{n_i}$ is an instance of a variant of $\bar{s}'$. Clearly, $\bar{s}\theta$ is also an instance of that variant of $\bar{s}'$. This completes the proof. ◁

Theorem 4.5 (Completeness). After a run of Algorithm 1 (using parameter $l$) on a query $(P, q(\bar{x}))$ and an extensional instance $I$, for every SLD-refutation of $P \cup I \cup \{\leftarrow q(\bar{x})\}$ that uses the leftmost selection function, does not contain any goal with term-depth greater than $l$ and has a computed answer $\theta$ with term-depth not greater than $l$, there exists $\bar{s} \in \mathit{tuples}(\mathit{ans\_q})$ such that $\bar{x}\theta$ is an instance of a variant of $\bar{s}$. ◁

This theorem immediately follows from Lemma 4.3. Together with Theorem 2.1 (on completeness of SLD-resolution) it establishes a relationship between correct answers of $P \cup I \cup \{\leftarrow q(\bar{x})\}$ and the answers computed by Algorithm 1 for the query $(P, q(\bar{x}))$ on the extensional instance $I$.

For queries and extensional instances without function symbols, we take term-depth bound $l = 0$ and obtain the following completeness result, which immediately follows from the above theorem.

Corollary 4.6. After a run of Algorithm 1 using $l = 0$ on a query $(P, q(\bar{x}))$ and an extensional instance $I$ that do not contain function symbols, for every computed answer $\theta$ of an SLD-refutation of $P \cup I \cup \{\leftarrow q(\bar{x})\}$ that uses the leftmost selection function, there exists $\bar{t} \in \mathit{tuples}(\mathit{ans\_q})$ such that $\bar{x}\theta$ is an instance of a variant of $\bar{t}$. ◁

5 Data Complexity 

In this section we estimate the data complexity of Algorithm 1, which is measured w.r.t. the size of the extensional instance $I$ when the query $(P, q(\bar{x}))$ and the term-depth bound $l$ are fixed.

If terms are represented as sequences of symbols or as trees then there will be a problem with complexity. Namely, unifying the terms $f(x_1, \ldots, x_n)$ and $f(g(x_0, x_0), \ldots, g(x_{n-1}, x_{n-1}))$, we get a term of exponential length (see footnote 4). If the term-depth bound $l$ is used in all steps, including the ones of unification, then the problem will not arise. But we do not want to be so restrictive.

To represent a term we use instead a rooted acyclic directed graph which is permitted to have multiple ordered arcs and caches nodes representing the same subterm. Such a graph will simply be called a DAG. As an example, the DAG of $f(x, a, x)$ has the root $n_f$ labeled by $f$, a node $n_x$ labeled by $x$, a node $n_a$ labeled by $a$, and three ordered edges outgoing from $n_f$: the first one and the third one are connected to $n_x$, while the second one is connected to $n_a$.

The size of a term $t$, denoted by $\mathit{size}(t)$, is defined to be the size of the DAG of $t$ (i.e. the number of nodes and edges of the DAG of $t$). The sizes of other term-based expressions or data structures are defined as usual. For example, we define:

— the size of a tuple $(t_1, \ldots, t_k)$ to be $\mathit{size}(t_1) + \ldots + \mathit{size}(t_k)$

— the size of a set of tuples to be the sum of the sizes of those tuples

— the size of a substitution $\{x_1/t_1, \ldots, x_k/t_k\}$ to be $k + \mathit{size}(t_1) + \ldots + \mathit{size}(t_k)$

— the size of a node $v$ of a QSQ-net $(V, E, T, C)$ to be the sum of the sizes of the components of $C(v)$.
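
To make the DAG representation and the size measure concrete, here is a minimal Python sketch. The class `TermDag` and its methods are hypothetical names used only here. Nodes are cached by label and ordered children (a simple form of hash-consing), so equal subterms share one node. The sketch compares the DAG size with the length of the fully expanded tree for the nested term $g(g(\ldots), g(\ldots))$ that arises in the unification example above.

```python
from typing import Dict, List, Tuple

NodeKey = Tuple[str, Tuple[int, ...]]


class TermDag:
    """Terms as a DAG: one node per distinct subterm, ordered edges to children."""

    def __init__(self) -> None:
        self._ids: Dict[NodeKey, int] = {}
        self._nodes: List[NodeKey] = []

    def node(self, label: str, *children: int) -> int:
        """Return the node for label(children...), reusing a cached node if any."""
        key = (label, children)
        if key not in self._ids:
            self._ids[key] = len(self._nodes)
            self._nodes.append(key)
        return self._ids[key]

    def size(self, root: int) -> int:
        """Number of nodes plus number of edges reachable from root."""
        seen, stack, edges = set(), [root], 0
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            children = self._nodes[n][1]
            edges += len(children)
            stack.extend(children)
        return len(seen) + edges

    def tree_length(self, root: int) -> int:
        """Number of symbols in the fully expanded (tree) form of the term."""
        return 1 + sum(self.tree_length(c) for c in self._nodes[root][1])


dag = TermDag()
x, a = dag.node("x"), dag.node("a")
f_xax = dag.node("f", x, a, x)          # f(x, a, x): 3 nodes and 3 edges

t = dag.node("x0")
for _ in range(10):                     # t_i = g(t_{i-1}, t_{i-1})
    t = dag.node("g", t, t)
```

For the nested term, the DAG grows linearly with the nesting depth while the expanded tree doubles at each level.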

Using DAGs to represent terms, unification of two atoms $A$ and $A'$ can be done in polynomial time in the sizes of $A$ and $A'$. In the case $A$ and $A'$ are unifiable, the resulting atom and the resulting mgu have sizes that are polynomial in the sizes of $A$ and $A'$. Similarly, checking whether $A$ is an instance of $A'$ can also be done in polynomial time in the sizes of $A$ and $A'$.

The following theorem estimates the data complexity of Algorithm 1 under the assumption that terms are represented by DAGs and unification and checking instances of atoms are done in polynomial time.

Theorem 5.1. For a fixed query and a fixed bound $l$ on term-depth, Algorithm 1 runs in polynomial time in the size of the extensional instance.

Proof. Consider a run of Algorithm 1 using parameter $l$ on a query $(P, q(\bar{x}))$ and on an extensional instance $I$ with size $n$. Here, $(P, q(\bar{x}))$ and $l$ are fixed. Thus, for every $1 \le i \le m$, $n_i$ is bounded by a constant. Similarly, if $p$ is an intensional predicate from $P$ then the arity of $p$ is also bounded by a constant.

Observe that the number of tuples that are added to any set of the form tuples(input_p) or tuples(ans_p) is bounded by a polynomial of $n$. The reasons are:

— intensional predicates come from $P$

— constant symbols and function symbols come from $P$ and $I$

— tuples(input_p) and tuples(ans_p) consist of tuples with term-depth bounded by $l$

— a tuple is added to a set of the form tuples(input_p) or tuples(ans_p) only when it is not an instance of a fresh variant of any tuple from the set

— a tuple is deleted from a set of the form tuples(input_p) or tuples(ans_p) only when it is an instance of a new tuple added to the set.

For similar reasons, the number of subqueries that are added to any set of the form subqueries(v) is also bounded by a polynomial of $n$.

Consequently, the sizes of sets of the form tuples(input_p), tuples(ans_p), subqueries(v), unprocessed(v, w), unprocessed_tuples(v), unprocessed_subqueries(v) or unprocessed_subqueries2(v) are bounded by a polynomial of $n$. Therefore, the size of the constructed QSQ-net is bounded by a polynomial of $n$, and any execution of procedure transfer, procedure fire or function active-edge is done in polynomial time in $n$.

A transfer or a firing for an edge $(u, v)$ is done only when a new tuple was added to tuples(u) or a new subquery was added to subqueries(u). Thus, we can conclude that Algorithm 1 runs in polynomial time in $n$. ◁

Corollary 5.2. Algorithm 1 with term-depth bound $l = 0$ is a complete evaluation algorithm with PTIME data complexity for the class of queries over a signature without function symbols. ◁

This corollary follows from Theorem 4.2 (on soundness), Corollary 4.6 (on completeness) and the above theorem (on data complexity).

Footnote 4: Another example is the pair $f(x_1, \ldots, x_n, x_1, \ldots, x_n)$ and $f(y_1, \ldots, y_n, g(y_0, y_0), \ldots, g(y_{n-1}, y_{n-1}))$.



6 QSQ-Nets with Tail Recursion Elimination 

A query-subquery net structure with tail recursion elimination (in short, QSQTRE-net structure) of $P$ is a tuple $(V, E, T)$ defined similarly to a QSQ-net structure of $P$, but with the following differences:

— $T$ is a function, called the type of the net structure, mapping

• each $\mathit{filter}_{i,j} \in V$ such that the predicate of $B_{i,j}$ is extensional to true or false

• each intensional predicate to true or false.

— If $A_i$ and $B_{i,n_i}$ have the same intensional predicate $p$ with $T(p) = \mathit{true}$ then $V$ does not contain the node $\mathit{post\_filter}_i$ and $E$ does not contain the edges $(\mathit{filter}_{i,n_i}, \mathit{post\_filter}_i)$, $(\mathit{post\_filter}_i, \mathit{ans\_p})$ and $(\mathit{ans\_p}, \mathit{filter}_{i,n_i})$.

The function $T$ can thus be called a memorizing type for extensional nodes $\mathit{filter}_{i,j}$ (as in QSQ-net structures), and a tail-recursion-elimination type for intensional predicates.

Example 6.1. Reconsider the positive logic program given in Example 3.1:

p(x, y) ← q(x, y)
p(x, y) ← q(x, z), p(z, y).

A QSQTRE-net structure $(V, E, T)$ of this program with $T(p) = \mathit{true}$ has the topological structure illustrated in Figure 2, which is like a loop. ◁

A query-subquery net with tail recursion elimination (in short, QSQTRE-net) of $P$ is a tuple $N = (V, E, T, C)$ defined similarly to a QSQ-net of $P$, but with the following differences:

— $(V, E, T)$ is a QSQTRE-net structure of $P$

— if $v = \mathit{input\_p}$ and $T(p) = \mathit{true}$ then $C(v)$ consists of:

• $\mathit{tuple\_pairs}(v)$: a set of pairs of generalized tuples of the same arity as $p$

• $\mathit{unprocessed}(v, w)$ for $(v, w) \in E$: a subset of $\mathit{tuple\_pairs}(v)$

— if $v = \mathit{filter}_{i,n_i}$, $\mathit{kind}(v) = \mathit{intensional}$, $\mathit{pred}(v) = p$ and $T(p) = \mathit{true}$ then the structure $C(v)$ does not contain $\mathit{unprocessed\_subqueries}(v)$ and $\mathit{unprocessed\_tuples}(v)$.

The intuition behind a pair $(\bar{t}, \bar{t}') \in \mathit{tuple\_pairs}(\mathit{input\_p})$ is that:

— $\bar{t}$ is a usual input tuple for $p$, but the intended goal at a higher level is $\leftarrow p(\bar{t}')$

— any correct answer for $P \cup I \cup \{\leftarrow p(\bar{t})\}$ is also a correct answer for $P \cup I \cup \{\leftarrow p(\bar{t}')\}$

— if a substitution $\theta$ is a computed answer of $P \cup I \cup \{\leftarrow p(\bar{t})\}$ then we will store in $\mathit{ans\_p}$ the tuple $\bar{t}'\theta$ instead of $\bar{t}\theta$.
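
The role of tuple pairs can be illustrated on the program of Example 6.1 with a deliberately simplified Python sketch. It assumes a ground query $p(\mathit{start}, y)$, a finite relation `q` of constant pairs, and no function symbols, so unification reduces to lookups. The function name `tail_recursive_answers` and the variable names are hypothetical, and the sketch does not build a QSQTRE-net. It only mimics how each pair carries the current input together with the original goal, so that answers are recorded for the original goal rather than for intermediate calls.

```python
from typing import Dict, List, Set, Tuple


def tail_recursive_answers(q: Set[Tuple[str, str]], start: str) -> Set[Tuple[str, str]]:
    """Answers to p(start, y) for  p(x,y) <- q(x,y)  and  p(x,y) <- q(x,z), p(z,y)."""
    succ: Dict[str, List[str]] = {}
    for x, y in q:
        succ.setdefault(x, []).append(y)

    # Each pair is (current input, original goal), as in tuple_pairs(input_p).
    pairs = {(start, start)}
    unprocessed = [(start, start)]
    answers: Set[Tuple[str, str]] = set()
    while unprocessed:
        current, goal = unprocessed.pop()
        for y in succ.get(current, []):
            # First clause: q(current, y) answers the ORIGINAL goal p(goal, y).
            answers.add((goal, y))
            # Second clause: the tail call p(y, _) inherits the original goal.
            if (y, goal) not in pairs:
                pairs.add((y, goal))
                unprocessed.append((y, goal))
    return answers


q_relation = {("a", "b"), ("b", "c"), ("c", "a"), ("d", "e")}
reachable = tail_recursive_answers(q_relation, "a")
```

No answers are stored for the intermediate calls on `b` or `c`; every answer is attributed directly to the original goal, which is the saving that tail recursion elimination aims for.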

Data transferred through an edge of the form $(\mathit{input\_p}, v)$ or $(v, \mathit{input\_p})$ in a QSQTRE-net $(V, E, T, C)$, where $p$ is an intensional predicate with $T(p) = \mathit{true}$, is redefined to be a finite set of pairs of generalized tuples of the same arity as $p$.

We say that a tuple pair $(\bar{t}, \bar{t}')$ is more general than $(\bar{t}_2, \bar{t}'_2)$, and $(\bar{t}_2, \bar{t}'_2)$ is an instance of $(\bar{t}, \bar{t}')$, if there exists a substitution $\theta$ such that $(\bar{t}, \bar{t}')\theta = (\bar{t}_2, \bar{t}'_2)$.

Other notions for QSQTRE-nets are defined similarly as for QSQ-nets.

Procedure transfer2(D, u, v) (given below) is a modified version of transfer(D, u, v) for dealing with tail recursion elimination.

Let procedure fire2(u, v) be the modified version of fire(u, v) obtained by:

— changing the calls of transfer to calls of transfer2 (with the same parameters)

— replacing Step 15 by the macro compute-gamma defined below.

Algorithm 2 (given below) is our reformulation of Algorithm 1 using QSQTRE-nets for evaluating queries.

Theorem 6.2. Theorems 4.2, 4.5 and Corollaries 4.6, 5.2 still hold when "Algorithm 1" is replaced by "Algorithm 2". ◁

7 Control Strategies 

Recall that in Algorithms 1 and 2 we repeatedly select an active edge and fire the operation
for it. Such selection is decided by the adopted control strategy, which can be arbitrary. 
In this section we describe two control strategies: the first one is to reduce the number of 
accesses to the secondary storage, while the second one is depth-first search, which gives 
priority to the order of clauses in the positive logic program defining intensional predicates 
and thus allows the user to control the evaluation to a certain extent. 

7.1 Reducing the Number of Accesses to the Secondary Storage 

It is reasonable to assume that the computer memory is not large enough to load the whole 
extensional instance of the knowledge base into it and evaluation of queries cannot usually 
be done totally in the computer memory. Note that, not only extensional relations may be 
too large, but temporary relations used for computing intensional predicates like tuples(v), 
unprocessed (v,w), subqueries(v), . . . may also be too large. Therefore, sometimes we have 
to load a relation into the computer memory, and sometimes we have to unload a relation to 
the secondary storage. As access to the secondary storage is time-consuming, it is desirable 
to reduce the total number of such accesses. Here is a strategy for this: 
Procedure transfer2(D, u, v)

Global data: a Horn knowledge base (P, I), a QSQTRE-net N = (V, E, T, C) of P, and a term-depth bound l.
Input: data D to transfer through the edge (u, v) ∈ E.

1 if D = ∅ then return;
2 if u is input_p and T(p) = true then
3   Γ := ∅;
4   foreach (t, t') ∈ D do
5     if p(t) and atom(v) are unifiable by an mgu γ then
6       add-subquery(t'γ, γ|post_vars(v), Γ, succ(v))
7   transfer2(Γ, v, succ(v))
8 else if v is input_p and T(p) = true then
9   foreach (t, t') ∈ D do
10    let (t2, t2') be a fresh variant of (t, t');
11    if (t2, t2') is not an instance of any pair from tuple_pairs(v) then
12      foreach (t3, t3') ∈ tuple_pairs(v) do
13        if (t3, t3') is an instance of (t2, t2') then
14          delete (t3, t3') from tuple_pairs(v);
15          foreach (v, w) ∈ E do delete (t3, t3') from unprocessed(v, w)
16      add (t2, t2') to tuple_pairs(v);
17      foreach (v, w) ∈ E do add (t2, t2') to unprocessed(v, w)
18 else if v is filter_{i,n_i}, kind(v) = intensional, pred(v) = p and T(p) = true then
19  foreach (t, δ) ∈ D do
20    if term-depth(atom(v)δ) ≤ l then
21      if no subquery in subqueries(v) is more general than (t, δ) then
22        delete from subqueries(v) all subqueries less general than (t, δ);
23        delete from unprocessed_subqueries2(v) all subqueries less general than (t, δ);
24        add (t, δ) to both subqueries(v) and unprocessed_subqueries2(v)
25 else
26   Steps 2 to 43 of procedure transfer with the recursive calls of transfer replaced by calls of transfer2



Procedure add-tuple-pair(t, t', Γ)

Purpose: add the pair of tuples (t, t') to Γ, but keep in Γ only the most general pairs.

1 let (t2, t2') be a fresh variant of (t, t');
2 if (t2, t2') is not an instance of any pair from Γ then
3   delete from Γ all pairs that are instances of (t2, t2');
4   add (t2, t2') to Γ



Procedure compute-gamma

Purpose: a macro used in procedure fire2 for replacing Step 15 of procedure fire.

1 if T(p) = false then
2   foreach (t, δ) ∈ unprocessed_subqueries2(u) do let p(t') = atom(u)δ, add-tuple(t', Γ)
3 else if j < n_i then
4   foreach (t, δ) ∈ unprocessed_subqueries2(u) do let p(t') = atom(u)δ, add-tuple-pair(t', t', Γ)
5 else
6   foreach (t, δ) ∈ unprocessed_subqueries2(u) do let p(t') = atom(u)δ, add-tuple-pair(t', t, Γ)
Algorithm 2: for evaluating a query (P, q(x)) on an extensional instance I.

let (V, E, T) be a QSQTRE-net structure of P; // T can be chosen arbitrarily
set C so that N = (V, E, T, C) is an empty QSQTRE-net of P;
let x' be a fresh variant of x;
if T(q) = false then
  tuples(input_q) := {x'};
  foreach (input_q, v) ∈ E do unprocessed(input_q, v) := {x'}
else
  tuple_pairs(input_q) := {(x', x')};
  foreach (input_q, v) ∈ E do unprocessed(input_q, v) := {(x', x')}
while there exists (u, v) ∈ E such that active-edge(u, v) holds do
  select (u, v) ∈ E such that active-edge(u, v) holds;
  // any strategy is acceptable for the above selection
  fire2(u, v)
return tuples(ans_q)



— If (u, v) and (u', v') are active edges of the considered QSQ-net/QSQTRE-net and firing 
the edge (u, v) can be done in the computer memory, while firing the edge (u', v') 
requires loading some relations from the secondary storage then the edge (u, v) has a 
higher priority than (u',v') (for being selected). 

— If firing either of the edges (u, v) and (u', v') can be done in the computer memory then:

• the one that could enable a next operation to be done in the computer memory (e.g. firing some edge (v, w) or (v', w')) is considered to have a higher priority than the other

• if both edges are equal w.r.t. the above criterion then the one that could enable more next operations to be done in the computer memory is considered to have a higher priority than the other

• if both edges are equal w.r.t. the above criteria then the one that processes more tuples/subqueries is considered to have a higher priority than the other.

— If no more operations can be done in the computer memory without loading relations from the secondary storage then select and load such a relation. The criteria for this selection are similar to the ones mentioned above: we choose a relation whose loading would enable more next operations to be done in the computer memory and would process more tuples/subqueries.

— If we want to load a relation into the computer memory but there is not enough available space then we have to select an in-memory relation and unload it to the secondary storage. We can choose to unload the in-memory relation that has been unused for the longest period.
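
The last rule above amounts to a least-recently-used replacement policy for relations. A minimal Python sketch of such a policy follows. The class `RelationCache`, the `load`/`unload` callbacks, and the idea of measuring capacity as a number of relations (rather than their sizes) are all assumptions of this illustration, not part of the described strategy.

```python
from collections import OrderedDict
from typing import Callable, Hashable, List


class RelationCache:
    """Keeps at most `capacity` relations in memory, unloading the least recently used."""

    def __init__(self, capacity: int,
                 load: Callable[[Hashable], set],
                 unload: Callable[[Hashable, set], None]) -> None:
        self.capacity = capacity
        self._load = load
        self._unload = unload
        self._mem: "OrderedDict[Hashable, set]" = OrderedDict()
        self.accesses = 0  # secondary-storage accesses performed so far

    def get(self, name: Hashable) -> set:
        if name in self._mem:
            self._mem.move_to_end(name)          # mark as most recently used
            return self._mem[name]
        if len(self._mem) >= self.capacity:
            victim, relation = self._mem.popitem(last=False)  # least recently used
            self._unload(victim, relation)
            self.accesses += 1
        relation = self._load(name)
        self.accesses += 1
        self._mem[name] = relation
        return relation

    def in_memory(self) -> List[Hashable]:
        return list(self._mem)


# A dictionary stands in for the secondary storage in this demo.
disk = {"q": {(1, 2)}, "tuples(ans_p)": set(), "subqueries(filter_2_2)": set()}
cache = RelationCache(2, load=lambda n: set(disk[n]), unload=disk.__setitem__)
for name in ["q", "tuples(ans_p)", "q", "subqueries(filter_2_2)"]:
    cache.get(name)
```

A control strategy could consult `in_memory()` to prefer edges whose relations are already loaded, and use `accesses` to compare strategies.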

7.2 Depth-First Evaluation 

The user may use Prolog programming style to specify the positive logic program defining 
intensional predicates. In such cases, e.g. as in Example ll.il the order of the program clauses 
may be essential and depth-first search may increase efficiency of query evaluation. 

For each node of the considered QSQ-net/QSQTRE-net we maintain and update its 
modification timestamp. For the depth-first evaluation approach, nodes are considered in 
decreasing order of modification timestamps. When a node v is considered, we choose 
an active edge (v, w) to fire. If there is no such edge, the next node in the mentioned order 
is chosen for consideration. If there is more than one successor w of v such that the edge 
(v, w) is active, choose an edge (v, w) according to the following strategy: 
— If v = input_p then w is the node pre-filter_i with the smallest index i such that (v, w) 
is active (i.e. we consider the program clause φ_i with the smallest index i such that the 
edge (v, pre-filter_i) is active). 

— If v = filter_{i,j}, kind(v) = intensional, v has two successors and both the edges 
(v, succ(v)) and (v, succ2(v)) are active, then w = succ(v). 

— If v = ans_p then w is the successor of v with the biggest modification timestamp such 
that (v, w) is active. 
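
The depth-first strategy above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class `Node`, its fields, and the functions `pick_successor` and `next_edge` are hypothetical, and the list `active_succ` is assumed to hold only successors reached by an active edge, ordered by clause index for input nodes and with succ(v) first for filter nodes.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Node:
    name: str
    kind: str              # "input", "filter", "ans", or anything else
    timestamp: int = 0     # modification timestamp
    # Successors with an active edge, in the order assumed above.
    active_succ: List["Node"] = field(default_factory=list)


def pick_successor(v: Node) -> Node:
    """Choose w among the active successors of v."""
    if v.kind == "ans":
        # the successor with the biggest modification timestamp
        return max(v.active_succ, key=lambda w: w.timestamp)
    # "input": pre-filter with the smallest index; "filter": succ(v)
    return v.active_succ[0]


def next_edge(nodes: List[Node]) -> Optional[Tuple[Node, Node]]:
    """Return the next edge (v, w) to fire, or None if no edge is active."""
    # Nodes are considered in decreasing order of modification timestamp.
    for v in sorted(nodes, key=lambda n: n.timestamp, reverse=True):
        if v.active_succ:
            return v, pick_successor(v)
    return None


# Small usage example with made-up timestamps.
p1 = Node("pre-filter_1", "other", 1)
p2 = Node("pre-filter_2", "other", 9)
inp = Node("input_p", "input", 5, [p1, p2])
w1 = Node("w1", "other", 2)
w2 = Node("w2", "other", 3)
ans = Node("ans_p", "ans", 4, [w1, w2])
chosen = next_edge([ans, inp, p1, p2, w1, w2])
```

In this example the most recently modified node has no active outgoing edge, so the search moves on to `input_p` and selects the pre-filter node with the smallest index.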

8 Conclusions 

We have provided the first framework for developing algorithms for evaluating queries to 
Horn knowledge bases with the properties that: the approach is goal-directed; each subquery 
is processed only once and each supplement tuple, if desired, is transferred only once; 
operations are done set-at-a-time; and any control strategy can be used. The framework forms 
a generic evaluation method called QSQN. We have proved soundness and completeness of 
our generic evaluation method and shown that, when the term-depth bound is fixed, the 
method has PTIME data complexity. 

This work is a continuation of [7]. It makes essential improvements: while the QSQR 
evaluation method of [7] uses iterative deepening search and performs redundant 
recomputations, the QSQN evaluation method developed in this paper allows any control 
strategy and avoids redundant recomputations. The QSQN evaluation method is also much 
more flexible, e.g., for reducing the number of accesses to the secondary storage. 

Our framework is an adaptation and a generalization of the QSQ approach of Datalog 
for Horn knowledge bases. One of the key differences is that we do not use adornments and 
annotations, but use substitutions instead. This is natural for the case with function 
symbols and without the range-restrictedness condition. When restricted to Datalog queries, 
our approach groups operations on the same relation together regardless of adornments, 
which allows reducing the number of accesses to the secondary storage, although "joins" 
and "projections" become more complicated. QSQ-nets are a more intuitive representation 
than the description of the QSQ approach of Datalog given in [1]. Our notion of QSQ-net 
makes a connection to flow networks and is intuitive for developing efficient evaluation 
algorithms. For example, as shown in the paper, it is easy to incorporate tail recursion 
elimination into QSQ-nets, resulting in QSQTRE-nets. 

In comparison with the most well-known evaluation methods, our QSQN evaluation 
method is more efficient than the QSQR evaluation method (as it does not do redundant 
recomputations) and is more flexible and thus has essential advantages over the bottom-up 
evaluation method based on magic-set transformation and improved seminaive evaluation 
(as shown in Example 1 